Note from the [Editor](http://clarecorthell.org): `Take Two`
<sup>In the old days of 2013, the OSDSM was born. Then, there were "little to no Data Scientists with 5 years experience, because the job simply did not exist." (_David Hardtke, Nov 2012_) Since then, history has witnessed many things, including:</sup>
<sup>• Data Scientists working across industries and the world</sup><br />
<sup>• social media manipulation disrupting many elections</sup><br />
<sup>• BLM and #metoo and Extinction Rebellion and many other social movements</sup><br />
<sup>• machine learning beginning to fall under the engineering domain</sup><br />
<sup>• a pandemic</sup><br />
<sup>• climate change disasters becoming very frequent while the climate warms faster than predicted</sup><br />
<sup>• remote work becoming common</sup><br />
<sup>• multiple global recession shocks</sup>
<sup>In that decade, Data Science has seen growth of jobs, shortfall of goals, success in many industries, abject failure in others, and nefarious use cases. In particular, [adverse consequences and complications of learning from data](http://machinebias.org) appear in too many examples: [elections undermined by psychographics](https://en.wikipedia.org/wiki/Facebook%E2%80%93Cambridge_Analytica_data_scandal), [dismal gender (Men=74%) and BIPOC diversity in the AI field](https://www.nytimes.com/2016/06/26/opinion/sunday/artificial-intelligences-white-guy-problem.html), a revived [eugenics](https://www.technologyreview.com/s/612275/sociogenomics-is-opening-a-new-door-to-eugenics/), an [explainability crisis](https://hbr.org/2018/07/when-is-it-important-for-an-algorithm-to-explain-itself), [facial recognition](https://www.theguardian.com/technology/2014/may/04/facial-recognition-technology-identity-tesco-ethical-issues) used to [identify people](https://onlinelibrary.wiley.com/doi/abs/10.1002/widm.1278) and systematically [detain them](https://www.buzzfeednews.com/article/meghara/china-new-internment-camps-xinjiang-uighurs-muslims), ["aggression" detection microphones in schools](https://features.propublica.org/aggression-detector/the-unproven-invasive-surveillance-technology-schools-are-using-to-monitor-students/), and many others. It has never been more clear that **we need to talk about the real world impacts of our work, and consider how our creations are used.** As you consider this, read a prescient [novel](https://library.oapen.org/bitstream/id/24cb1da5-a512-4de1-b24c-639b6452dbec/628778.pdf) that grapples with the consequences of birthing, of creation, of technology.</sup>
<sup>Like any tool, data-driven technologies are indifferent to the morality of their ends. Perhaps the greatest risk of all is leaving this tool in the hands of the few expensively-educated people who cannot possibly represent all of us. To balance this, open source movements seek to lower the barriers to education for everyone. Data science and data literacy must be widespread, accessible, and leveraged for building our collective future. More than ever, we need that future to be built by members of society who are diverse and focused on generative, sustainable, resilient, [emergent](https://bookshop.org/a/2958/9781849352604) solutions. After all, the things we build are mirrors of ourselves (seriously, read Shelley's [Frankenstein](https://library.oapen.org/bitstream/id/24cb1da5-a512-4de1-b24c-639b6452dbec/628778.pdf)).</sup>
> <sup>_Computers reflect the biases and belief systems of the people programming them_ -[@alicegoldfuss](https://twitter.com/alicegoldfuss/status/1016034359134941184)</sup>
<sup>The OSDSM is built with the belief that **open source education makes diverse, collective, generative future-building possible.** I hope that you are one of the next people -- whether you call yourself a Data Scientist or not -- to help make better decisions with the scientific process, critical thinking, and everything else your unique perspective brings to the table. This rewritten curriculum focuses on what is needed to succeed in an entry-level role, but that is just a generic outline; truly, I hope where you take it extends far beyond that.</sup>
***
Start here
## The Open Source Data Science Masters
The open-source curriculum for learning to be a Data Scientist. Curriculum resources from both universities and working Data Scientists focus on foundational theory and applied skills. The OSDSM is collectively maintained and open to PRs.
The goal of this curriculum is to prepare the student for an entry-level Data Scientist role, using open source materials, at no cost but with the same caliber of materials found in the most reputable paid programs. Books that are not free are listed with their current list price and are often available through a public library. The Masters is self-guided and self-accredited. To better support credibility, the structure now includes a Capstone project intended to demonstrate the student's problem-solving approach, skills in execution, and communication. Upon completion, the student can award themselves a [Credential](https://help.accredible.com/add-your-credential-to-linkedin) on LinkedIn from the Open Source Data Science Masters. As with all things, the OSDSM is best played as a team sport (try finding people on [r/learndatascience](https://www.reddit.com/r/learndatascience/)).
This is called a "Masters" because it is primarily concerned with "upper-level" college course material in mathematics, programming, economics, or related disciplines. Come as you are!
1. **The Core** - This is a critical foundation for what is to come; don't skip the foundational lessons.
2. **Specialty** - Choose what is most interesting to you, or most relevant to the work you plan to do.
3. **Doing Data Science** - Learn about how doing science with others and for businesses can work.
4. **Capstone Project** - Choose a meaningful project or dataset to demonstrate what you've learned.
## The Core
_This is a critical foundation for what is to come; don't skip!_
### What is Data Science?
One could argue that "Data Science" is a recent term for an already existing information analysis discipline. Humans instinctually search for patterns, a purpose we also see in this more digitized discipline. Read different sources (and search beyond this list) about the uses of data science.
- The Signal and The Noise / Nate Silver [Book ```$18```](https://bookshop.org/a/2958/9780143125082) -- Narrated cases of Data Science at play in the real world.
- Dataclysm: Who We Are (When We Think No One's Looking) / Christian Rudder [Book ```$17```](https://bookshop.org/a/2958/9780385347396) -- From the inside of OKCupid, real examples of how data science can illustrate human behavior.
- Informatics of the Oppressed / Rodrigo Ochigame [Logic Magazine](https://logicmag.io/care/informatics-of-the-oppressed/) -- _Algorithms of oppression have been around for a long time. So have radical projects to dismantle them and build emancipatory alternatives._
- A showcase of [Jupyter Python Data Analysis Notebooks](https://github.com/jupyter/jupyter/wiki) across disciplines.
### Foundations of Data Science
#### Problem Solving
When there are no answers in the back of the book, how do you proceed? Breaking down problems is a skill, one that can and should be learned. Follow Pólya's process, and for extra credit, seek out resources on computer science [decomposition](http://sites.fas.harvard.edu/~libs111/files/lectures/unit1-3.pdf).
* Problem-Solving Heuristics "How To Solve It" George Pólya [Berkeley / Summary](https://math.berkeley.edu/~gmelvin/polya.pdf) [Book ```$18```](https://bookshop.org/a/2958/9780691164076)
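
To make decomposition concrete, here is a minimal Python sketch that loosely follows Pólya's four steps (understand the problem, devise a plan, carry out the plan, look back). The task, the sample readings, and every function name below are hypothetical and chosen only for illustration; they do not come from the linked resources.

```python
from statistics import median

# Understand the problem: given comma-separated sensor readings (made-up data),
# report a summary that is not distorted by blanks or obvious outliers.
# Devise a plan: split the work into three small, separately testable pieces.

def parse_readings(raw: str) -> list[float]:
    """Sub-problem 1: turn raw text into numbers, skipping blank entries."""
    return [float(tok) for tok in raw.split(",") if tok.strip()]

def drop_outliers(values: list[float], limit: float) -> list[float]:
    """Sub-problem 2: keep only values within `limit` of the median."""
    mid = median(values)
    return [v for v in values if abs(v - mid) <= limit]

def summarize(values: list[float]) -> dict[str, float]:
    """Sub-problem 3: reduce the cleaned values to a small summary."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "median": median(values),
    }

def solve(raw: str, limit: float = 10.0) -> dict[str, float]:
    """Carry out the plan by composing the three pieces."""
    return summarize(drop_outliers(parse_readings(raw), limit))

result = solve("12.1, 11.8, , 12.4, 98.0, 12.0")

# Look back: check the answer against something you can verify by hand.
# One blank and one outlier (98.0) should be gone, leaving four readings.
assert result["n"] == 4
```

Each helper can be tested and replaced on its own, which is the practical payoff of breaking a problem down before writing the full solution.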
### The Scientific Process & Experimentation
As a Data Scientist, it is crucial that you show integrity and transparency in your scientific process. Even if you've been here before, review and draw out the process diagram for the scientific method.
- [The Scientific Proces