In recent years the world has seen an explosion in the quantity and variety of data routinely recorded and analysed by research and industry, prompting some social commentators to refer to this phenomenon as the rise of "big data," and the analysts and practitioners who investigate the data as "data scientists."
The data may come from a variety of sources, including scientific experiments and measurements, or may be recorded from human interactions such as browsing data or social networks on the Internet, mobile phone usage or financial transactions. Many companies, too, are realising the value of their data for analysing customer behaviour and preferences, for recognising patterns of behaviour (such as credit card usage or insurance claims) to detect fraud, and for evaluating risk more accurately and increasing profit.
To obtain insights from big data, practitioners require new analytical techniques. These include computationally intensive and interactive approaches such as visualisation, clustering and data mining. The management and processing of large data sets require the development of enhanced computational resources and new algorithms to work across distributed computers.
This unit will introduce students to the analysis and management of big data using current techniques and both open source and proprietary software tools. Data and case studies will be drawn from diverse sources including health and informatics, life sciences, web traffic and social networking, business data (including transactions and customer traffic), scientific research and experimental data. The general principles of analysis, investigation and reporting will be covered. Students will be encouraged to critically reflect on the data analysis process within their own domain of interest.
2 hrs lectures/wk, 2 hrs laboratories/wk
Lectures: 2 hours per week
Tutorials/Lab Sessions: 2 hours per week per tutorial
Up to an additional 8 hours in some weeks may be needed for completing lab and project work, private study and revision.
FIT1006, ETC1000 or equivalent (for example, BUS1100, ETC1010, ETC2010, ETF2211, ETW1000, ETW1010, ETW1102, ETW2111, ETX1100, ETX2111, ETX2121, MAT1097, STA1010).
Dr John Betts
Dr Sue Bedingfield
Mr Rj Chow
Dr Kefeng (Jason) Xuan
Week | Activities | Assessment |
---|---|---|
0 | No formal assessment or activities are undertaken in week 0 | |
1 | Introduction to Data Science. Introduction to R. Review of basic statistics using R | Tutorial Participation assessed Weekly |
2 | Exploring data using graphics in R | |
3 | Analytics and modelling in R | |
4 | Data cleansing, consulting, case studies. (Guest Lecture) | |
5 | Programming in R | Group Assignment (Initial report) due 30 August 2013 |
6 | Classification using decision trees | |
7 | Comparing classification models, ensemble techniques | |
8 | K-Means and hierarchical clustering | |
9 | Text analysis | |
10 | Scalable algorithms. MapReduce | Individual Assignment due 11 October 2013 |
11 | Student Presentations | Students will give a brief presentation of their group project results. Group Assignment (Final report) due 18 October 2013 |
12 | Review of the course and exam preparation | |
SWOT VAC | No formal assessment is undertaken in SWOT VAC | |
Examination period | LINK to Assessment Policy: http://policy.monash.edu.au/policy-bank/academic/education/assessment/assessment-in-coursework-policy.html | |
*Unit Schedule details will be maintained and communicated to you via your learning system.
Examination (2 hours): 60%; In-semester assessment: 40%
Assessment Task | Value | Due Date |
---|---|---|
Group Assignment | 20% | Initial report due 30 August 2013, Final report due 18 October 2013 |
Individual Assignment | 10% | 11 October 2013 |
Tutorial Participation | 10% | Weekly |
Examination 1 | 60% | To be advised |
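As an illustration of how the weights in the table above combine, the following is a minimal sketch in Python (the unit itself teaches R). The function name `overall_mark`, the dictionary keys and the example scores are hypothetical and chosen only for illustration. The sketch assumes each component is scored out of 100 and does not model any hurdle requirements or the group contribution weighting.

```python
# Assessment weights taken from the table above (percent of the final mark).
WEIGHTS = {
    "group_assignment": 20,
    "individual_assignment": 10,
    "tutorial_participation": 10,
    "examination": 60,
}


def overall_mark(scores):
    """Combine component scores (each out of 100) into a weighted overall mark.

    Hypothetical helper: hurdle rules and peer weighting are NOT modelled.
    """
    # The four weights should account for the whole mark.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores for one student (assumed values, not real data).
example_scores = {
    "group_assignment": 80,
    "individual_assignment": 70,
    "tutorial_participation": 100,
    "examination": 65,
}

example_total = overall_mark(example_scores)
print(example_total)
```

With these assumed scores the in-semester components contribute 16 + 7 + 10 = 33 marks and the examination contributes 39, so the sketch prints 72.0.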
Faculty Policy - Unit Assessment Hurdles (http://www.infotech.monash.edu.au/resources/staff/edgov/policies/assessment-examinations/unit-assessment-hurdles.html)
Academic Integrity - Please see the Demystifying Citing and Referencing tutorial at http://lib.monash.edu/tutorials/citing/
As this is a group project, students in each group will allocate a weighting of the final results to each member of the group based on a consensus estimate of each member's contribution.
Monash Library Unit Reading List
http://readinglists.lib.monash.edu/index.html
Submission must be made by the due date; otherwise, penalties will be enforced.
You must negotiate any extensions formally with your campus unit leader via the in-semester special consideration process: http://www.monash.edu.au/exams/special-consideration.html
As per Faculty policy (referencing for Master coursework and undergraduate - see http://intranet.monash.edu.au/infotech/resources/staff/edgov/policies/units/style-masters-ug-degrees.html), the Unit Guide will include links to the relevant referencing requirements for the unit.
It is a University requirement (http://www.policy.monash.edu/policy-bank/academic/education/conduct/plagiarism-procedures.html) for students to submit an assignment coversheet for each assessment item. Faculty Assignment coversheets can be found at http://www.infotech.monash.edu.au/resources/student/forms/. Please check with your Lecturer on the submission method for your assignment coversheet (e.g. attach a file to the online assignment submission, hand in a hard copy, or use an online quiz). Please note that it is your responsibility to retain copies of your assessments.
W. N. Venables, D. M. Smith. (2013). An Introduction to R. Available from: http://www.cran.r-project.org/doc/manuals/R-intro.pdf.
M. Allerhand. (2011). A tiny handbook of R. SpringerLink (Online service), Online access via Library.
Pang-Ning Tan, Michael Steinbach, Vipin Kumar. (2006). Introduction to data mining. Addison-Wesley.
Luis Torgo. (2011). Data mining with R: learning with case studies. Chapman & Hall/CRC.
Monash has educational policies, procedures and guidelines, which are designed to ensure that staff and students are aware of the University’s academic standards, and to provide advice on how they might uphold them. You can find Monash’s Education Policies at: www.policy.monash.edu.au/policy-bank/academic/education/index.html
Key educational policies include:
The University provides many different kinds of support services for you. Contact your tutor if you need advice and see the range of services available at http://www.monash.edu.au/students. For Sunway see http://www.monash.edu.my/Student-services, and for South Africa see http://www.monash.ac.za/current/.
The Monash University Library provides a range of services, resources and programs that enable you to save time and be more effective in your learning and research. Go to www.lib.monash.edu.au or the library tab in the my.monash portal for more information. At Sunway, visit the Library and Learning Commons at http://www.lib.monash.edu.my/. In South Africa, visit http://www.lib.monash.ac.za/.
For more information on Monash’s educational strategy, see www.monash.edu.au/about/monash-directions, and on student evaluations, see www.policy.monash.edu/policy-bank/academic/education/quality/student-evaluation-policy.html
This is a new unit.
If you wish to view how previous students rated this unit, please go to https://emuapps.monash.edu.au/unitevaluations/index.jsp