MATH36031 Project 3 - deadline 16th December 2022, time 1100hrs.
For this project you will need to download the bananas.csv file from Blackboard; it is
located in the Projects folder, in the Project 3 section. This is a very large data file with
almost 12000 entries, so do not try to print the file! The first few lines of the file are as shown in
Figure 1. The Origin header denotes the country of origin of the bananas.
Figure 1: The first few lines of the bananas.csv file.
The Date header shows the date for the data listed, and the Price header column shows the cost
in pounds sterling for a unit of the product recorded on that date. The Units header indicates
the unit (such as pounds per kilogram in the above sample).
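A minimal sketch of loading and previewing the file without printing it in full is given here. It assumes that bananas.csv has been saved in the current MATLAB folder and that readtable picks up the header row as column names; the variable names csvFile and T are illustrative only.

```matlab
% Sketch only: assumes bananas.csv sits in the current folder and that its
% header row supplies the column names (Origin, Date, Price, Units).
csvFile = 'bananas.csv';

if isfile(csvFile)
    T = readtable(csvFile);              % read the whole file into a table
    head(T, 5)                           % preview the first five rows only
    fprintf('Number of rows: %d\n', height(T));
    disp(T.Properties.VariableNames);    % column names as MATLAB read them
else
    warning('%s not found in the current folder.', csvFile);
end
```

Using head rather than displaying T keeps the output short, which matters for a file of this size.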
You need to process the file using MATLAB to answer the following questions:
1. From the data produce a list of all the distinct entries under the Origin header and
check how many distinct Units there are.
2. The price of the bananas fluctuates a lot during the year. For each name listed under
the Origin header find the mean price of that variety.
3. Produce a grouped box plot comparing the variation of the prices of the different
varieties of bananas with Origin:
‘colombia’, ‘costa rica’, ‘dominican republic’, ‘honduras’, ‘jamaica’, ‘windward isles’,
‘mexico’
and comment on your results.
4. Taking the time series for the variety with Origin ‘colombia’, analyze the time series
and comment on any seasonal trends.
5. Use the corrcoef function to calculate the correlation coefficients between the fluctuations
of prices for the varieties with Origin ‘colombia’ and ‘costa rica’.
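As a rough illustration of MATLAB functions that may be useful for the questions above (not a model answer), the sketch below works on a small invented table rather than the real data. The column names follow the headers described earlier; all values, the unit string 'GBP/kg' and the variable names are assumptions made for the example. It correlates the raw prices on matching dates, which is only one possible reading of "fluctuations"; differencing the series first with diff is another. boxchart requires MATLAB R2020a or later.

```matlab
% Toy data standing in for bananas.csv; every value here is invented.
Origin = {'colombia'; 'costa rica'; 'colombia'; 'costa rica'; 'colombia'; 'costa rica'};
Date   = datetime(2022, [1; 1; 2; 2; 3; 3], 1);
Price  = [0.80; 0.90; 0.85; 0.95; 0.90; 1.05];
Units  = repmat({'GBP/kg'}, 6, 1);
T = table(Origin, Date, Price, Units);

% Distinct origins and number of distinct units
origins = unique(T.Origin);
nUnits  = numel(unique(T.Units));

% Mean price for each origin
[G, originNames] = findgroups(T.Origin);
meanPrice = splitapply(@mean, T.Price, G);

% Box plot of price grouped by origin
boxchart(categorical(T.Origin), T.Price);
ylabel('Price');

% Monthly mean for one origin, as a first look at seasonality
isCol = strcmp(T.Origin, 'colombia');
[Gm, monthNumber] = findgroups(month(T.Date(isCol)));
priceCol = T.Price(isCol);
monthlyMean = splitapply(@mean, priceCol, Gm);

% Align two origins on common dates, then correlate their prices
isCR = strcmp(T.Origin, 'costa rica');
col  = table(T.Date(isCol), T.Price(isCol), 'VariableNames', {'Date', 'PriceColombia'});
cr   = table(T.Date(isCR),  T.Price(isCR),  'VariableNames', {'Date', 'PriceCostaRica'});
both = innerjoin(col, cr, 'Keys', 'Date');
R = corrcoef(both.PriceColombia, both.PriceCostaRica);   % 2-by-2 matrix
```

The off-diagonal entry R(1,2) is the correlation coefficient between the two price series. Joining on Date first ensures that the two vectors passed to corrcoef have the same length and refer to the same dates.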
Outputs required You are required to submit a report (maximum 8 pages including
any appendices) in PDF form via the submission box on Blackboard. Additionally, you need
to submit the m-files containing the MATLAB code used to answer the above questions.