Request for Comments: Data Science Curriculum v2 #61
Comments
It might be helpful to have a direct link to the guidelines so it's easier to read and comment: Very short and nice read, just 16 pages. Looks like you already made all the necessary course changes in the pull request. All the links are alive and the courses cover the Key Competencies on Page 6, and the Six Main Subject Areas & Outline on Page 9 very well. Really excellent work! The topic progression path in the pull request looks somewhat different from the possible path in Figure 1, Page 12 of the guidelines. I think it's fine, but maybe you could explain a little bit for those who are curious? One question to which I do not see an immediate answer is: where would the "Capstone Experience" and "Course in an outside discipline" mentioned in the Outline on Page 9 come from? Are they contained in some of the courses in the curriculum? By the way, I don't know much about data science at all, and I don't have a horse in this race. Just trying to be helpful. |
For suggested changes at #60 (comment) regarding Algorithms Part 1 and 2: These courses are completely in Java and have a steep learning curve from the start. Those coming straight from Python/R/Julia will have a hard time adjusting to both the course materials and the programming syntax. Suggest an optional course on Java as a prerequisite, specifically Java Programming I and II by the University of Helsinki. It gives college credit for Finland residents. |
I certainly think adding resources in Extras for teaching Java would be appropriate, as would adding a note in the main curriculum that those resources are available. The University of Helsinki courses are high quality and I have no objection to listing them as a resource. One other option to keep in mind is Computer Science: Programming with a Purpose. One thing to recommend this alternative is that it is taught by the same instructor as the Algorithms courses. This could be used instead of Introduction to Computer Science and Programming Using Python and Introduction to Computational Thinking and Data Science, or in addition to them. |
Yeah, that's not an easy choice. The MITx pair go well together as a series, just like Sedgewick's series. On one hand, MITx uses Python, which is what most people will be programming with, but Intro to Computational Thinking might not be as rigorous as the alternative and covers a range of things implemented in Python (distributions, Monte Carlo, etc.). This would be helpful for the DS student as it'll give more practice in a language they'll definitely encounter. It'll also reinforce topics covered in probability and statistics. On the other hand, Sedgewick's courses will give you a very thorough understanding of algorithms specifically, and the textbooks are available online, with updates and resources. Learning Java will also be good for anyone who will work at larger companies and be exposed to these types of codebases, so this route would be good for something you 'might' encounter. Personally, I think the Sedgewick combination would be best in the CS curriculum, mainly because it's more aligned with CS than DS in my opinion, as I don't think these courses are as necessary for machine/deep learning. They would be if you were programming the libraries themselves, but that's why I think they're more relevant for CS. I would definitely suggest having them in the DS curriculum as an extra though. |
Head First Java might be an excellent option and is beginner-friendly. |
Regarding the Algorithms section: The OSSU route for CS suggests the Algorithms specialization from Stanford on Coursera: https://www.coursera.org/specializations/algorithms. The DS major suggests Algorithms I & II from Princeton on Coursera: https://www.coursera.org/learn/algorithms-part1. Would there be value in using the same set of courses to cover algorithms between both programs? |
Yes, there would be. While Discord channels for the Data Science individual courses have not been added yet, they will be in the future. If Data Science and Computer Science students are in the same course, they can be in the same discussion rooms, increasing critical mass for productive peer learning. The natural next question is: Why does the proposal include a different algorithms course for Data Science? Essentially, computer scientists need to know more about complexity and computability than data scientists do. Some CS2013 requirements are:
These match up with the 3rd and 4th Stanford algorithms courses, which teach:
The CGUPDS, by contrast, requires:
This is a decent fit for Princeton's Algorithms which teaches:
I think that curricular fit here is an overriding concern, but I'm interested to hear the opposing case. |
I agree with your concern. The default proposed curriculum should cover the material in CGUPDS and not try to cover an inordinate amount of additional material. I think another possibility would be to list courses that appear in both curricula as alternatives where appropriate. For example, the DS curriculum could list the Stanford Algorithms Specialization as an alternative that also fulfills the requirements of the DS course, with the caveats that the Stanford specialization covers more material and requires a larger time commitment. This may help capture the benefit you mentioned.
If an acceptable alternative is present in the CS program, then listing it would seem to facilitate this goal. |
Have you guys seen The Open Source Data Science Masters website, the GitHub of Siraj Raval (a data science YouTuber), and Data Science From Scratch? Maybe they have good guidelines and course options for the new DS Curriculum... I don't know... just giving suggestions... |
I just found this RFC on Friday 8/28, and I haven't yet had a chance to deep-dive it, but in the spirit of commenting before the close date of the RFC, I have a few thoughts:
|
@waciumawanjohi Thank you for your responses here. I should clarify, I was looking at the current V1 curriculum while developing my comments above. I'll take a closer look to see how these ideas are proposed to be implemented in the V2 and comment further. |
Great! It sounds like we're thinking in similar directions. |
Can I start my first course from the V2 Curriculum or should I wait a little longer? |
@EWCunha I would recommend any student that's starting now use V2. |
I have reviewed the proposed CGUPDS curriculum guidelines and the candidate V2. Overall, I like the thrust and structure of the new program and I have no exceptions or recommendations for substantive changes at this time. These are good selections, and seem to fit the curriculum guidelines well. A couple of comments:
|
Close of the Comment Period
Findings:
Response:
CGUPDS makes mention of needing a course in another discipline, but gives this recommendation not even a paragraph of support. It strikes me as similar to a recommendation for a balanced liberal arts education. And while I highly value such an education, that's different from the goal of OSSU. OSSU supports the study of particular domains and leaves the rounding out of other domains as an exercise for the learner.* As such, no work in other disciplines is contained in this revision.
OSSU should recommend how students can undertake a capstone experience. I don't have an answer for this question at the moment. This is left undone. I hope that contributors can propose and discuss options, either in the Issues here or in the OSSU Discord.
Conclusion: |
...In going to add Py4E to V2, I noticed that it is already in the curriculum. oof. |
Problem:
The curriculum has not been maintained and does not represent best practice.
Duration:
2020-08-31
Background:
OSSU recommends courses that would constitute an undergraduate major in Data Science. It is our responsibility to ensure that we follow best practice. To do so, we must bring the curriculum into alignment with external guidelines. A candidate set of guidelines has been identified and previously proposed.
In 2017, the Annual Review of Statistics and Its Application published the report "Curriculum guidelines for undergraduate programs in data science." The report was authored by “25 undergraduate faculty from a variety of institutions in the United States, primarily from the disciplines of mathematics, statistics, and computer science.” It had a goal of providing “structure for institutions planning for or revising a major in data science.”
The current state of OSSU Data Science is one of disrepair. The curriculum has had 1 change in 3 years. That change deleted a link to a broken application. But there remained many links to courses that are no longer offered. A list of these can be found here. Prospective students have posted in the issues asking if the Data Science curriculum is still maintained. Updating the curriculum must ensure that all courses are available for students.
Proposal:
OSSU Data Science should adopt “Curriculum guidelines for undergraduate programs in data science” (CGUPDS) as our guidelines. The curriculum should be updated to match. The exact changes can be reviewed in this pull request.