Big City, Big Data: Data Science Education at Columbia University


Columbia University’s Data Science education programs are fairly new, with the Data Science Institute founded in 2012. The program seeks to take an interdisciplinary approach, drawing on more than 200 faculty from the nine schools that make up the university. The classes were designed specifically for these programs, not pulled from existing courses.

Research at the Institute centers on seven areas: foundations of Data Science, cybersecurity, financial and business analytics, health analytics, new media, smart cities, and sensing, moving, and collecting data. In addition to the seven centers, there are a number of working groups focused on applications in materials science and the natural sciences.

Programs of Study

The Institute offers online courses and two on-campus programs: a non-degree certification in Data Science and a degree program that awards a Master of Science in Data Science.

  • Certification in Data Sciences

The Certification in Data Sciences is a non-degree, part-time program consisting of four three-credit courses; up to about 50 students may be enrolled per year. It’s intended to provide a broad exposure to the foundations of Data Science. There are two computer science courses, Algorithms for Data Science and Machine Learning for Data Science, and two statistics courses, Probability & Statistics and Exploratory Data Analysis & Visualization.

Most of the courses are offered in the evenings and the program is typically completed in two semesters with students taking two classes per semester. Admissions are made in the fall only, with the fall courses a prerequisite to the spring courses. Many of the students in the program are full-time, working professionals, often with an advanced degree in another discipline.

  • Master of Science in Data Science

Students in the MS program can study either full time or part time. Approximately 90 students were in the 2015 cohort. Completing the degree requires three computer science courses (Algorithms for Data Science, Machine Learning for Data Science, and Computer Systems for Data Science), three statistics courses (Probability, Statistical Inference & Modeling, and Exploratory Data Analysis & Visualization), a Data Science capstone project, and three electives chosen by the student.

The capstone course requires the student to work on a group project, either selected by faculty or industry-driven. The electives must be taken from Columbia’s graduate-level technical courses in the student’s area of interest. Although research isn’t a mandatory part of the program, students can receive credits for doing research in one of the Institute’s research centers.

As with the certificate program, admission is offered only in the fall. Although the certification courses and the degree program courses overlap, students can’t complete the certification program and then apply the courses to the MS; it’s necessary to withdraw from the certification program to switch to the master’s.


Students who aren’t interested in studying on campus can explore the courses offered through ColumbiaX. There are several Data Science-related courses online, including Data Science and Analytics in Context, Statistical Thinking for Data Science and Analytics, Machine Learning for Data Science and Analytics, Enabling Technologies for Data Science and Analytics: The Internet of Things, and Big Data in Education. By completing three courses that comprise the Data Science and Analytics XSeries, online students can obtain an edX certificate. This program aims to explore the concepts of Data Science without a heavy degree of programming.

Admissions and Student Body

The application process requires transcripts, three letters of recommendation, and a personal statement. The Master of Science program requires the GRE exam, though this requirement can be waived for students who already have an advanced degree. TOEFL is required for foreign students. The GRE and TOEFL exams are not required for the certificate program. Although there are no minimum score requirements, the average GRE quantitative scores are in the mid-160s, verbal scores are in the mid-150s, and writing scores fall between 3.5 and 5.0. TOEFL scores are typically above 90.

Students must have an undergraduate degree and have taken at least one prior quantitative course such as linear algebra; a more quantitative background provides a stronger application. Prospective students should also have had at least an introduction to programming. The courses at the Institute require programming in R and Python, so students should be prepared to pick them up if they don’t already know them.

These requirements don’t mean students are drawn only from mathematics and computer science undergraduate majors; fewer than one third of students have those degrees, and there are many students with backgrounds in the natural and social sciences. In addition, the faculty takes a holistic perspective when reviewing applications, and low test scores can potentially be balanced by other factors.

In the MS program, the average age of students is 27, and some already have another graduate degree or work experience; about half the students are currently working and complete the program part time. The certification students tend to be slightly older, with an average age of 32 and five to ten years of work experience.

Tuition is charged per credit; the current cost is $1,782 per credit. At that rate, the certification program’s fee for 12 credits is $21,384, and the MS program’s fee for 30 credits is $53,460. Tuition is anticipated to increase 3 to 4 percent annually, so students should expect the total cost to rise the longer they take to complete the coursework. The university may charge other fees in addition to tuition. The Data Science Institute doesn’t offer financial assistance; students may apply for financial aid through FAFSA.
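For readers who want to check the arithmetic or estimate the effect of the annual increase, here is a minimal sketch in Python. The per-credit rate and credit totals come from the figures above. The function names, the even split of credits across years, and the assumption that an increase applies once at the start of each later year are all illustrative and not the university's actual billing rules.

```python
RATE_PER_CREDIT = 1782  # dollars per credit, as quoted in the article


def flat_tuition(credits, rate=RATE_PER_CREDIT):
    """Tuition if every credit is billed at the current rate."""
    return credits * rate


def projected_tuition(credits_per_year, annual_increase, rate=RATE_PER_CREDIT):
    """Estimate tuition when credits are spread over several years.

    credits_per_year: hypothetical schedule, e.g. [15, 15] for two years.
    annual_increase: assumed yearly rise as a fraction, e.g. 0.03 for 3 percent.
    Assumption: the first year is billed at the current rate and each
    later year is billed at the compounded rate.
    """
    total = 0.0
    for year, credits in enumerate(credits_per_year):
        total += credits * rate * (1 + annual_increase) ** year
    return round(total, 2)


if __name__ == "__main__":
    print("Certification, 12 credits:", flat_tuition(12))
    print("MS, 30 credits:", flat_tuition(30))
    # Hypothetical part-time schedule: 10 credits a year for three years.
    print("MS over three years at 4%:", projected_tuition([10, 10, 10], 0.04))
```

Under these assumptions, a student who spreads the 30 MS credits over more years pays more in total than the flat figure, which is the point the paragraph above makes.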

Industry and Employment

The Institute has several programs to integrate industry affiliates with the academic program, and currently has connections to more than 160 employers. Students gain industry exposure through summer internships and the capstone project. The school also offers students career services, information sessions, and help scheduling interviews. In addition to participating in the Engineering Career Fair, the Institute holds a startup fair in the spring, plus special meetings with its industry partners.

The program is also closely involved with New York City’s Applied Sciences Initiative, which helps draw high tech industry to the city. When the Institute was established, the goal was to help generate $4 billion in economic activity, 170 new companies, and 4,500 jobs over 30 years. As such, the Institute’s research centers and academic programs are oriented towards the industries in the metropolitan area.

Student Activities

The Institute encourages its students to participate in student life on campus in addition to their academic studies. The Columbia Data Science Society is a student group that arranges a range of activities, including corporate visits and a hackathon. Its members participate in other public Data Science challenges, offer workshops to delve into specific skills, and bring various speakers to campus.

Future Plans

Although the MS program doesn’t require research and there is a strong relationship with industry, the Institute expects that students who earn the MS will be prepared to pursue a PhD if they desire; one student went on to a doctorate in epidemiology. The Institute is working to create a Data Science PhD program, which it expects to launch in a few years. Students who complete their studies at Columbia are well prepared to begin a Data Science career in industry or to continue with further studies.


photo credit: Columbia University
