Data Scientist is the #1 job in America, according to Glassdoor.

Located in San Francisco, Pacific's MS in Data Science program equips students for the exciting field of data science. This STEM-designated program uses a hybrid approach to learning with most courses requiring attendance in both in-person and online class sessions. It consists of 4 semesters spread over 2 academic years, during which 32 units must be completed for degree conferral. Starting in Fall 2019, each unit costs $1,498. Our students receive a personalized education featuring the small class sizes that are a trademark of a Pacific education.

Data scientists are in high demand. Every industry realizes that data is the key to future success, and organizations are looking for data scientists with deep knowledge of analytics. Pacific's Data Science program will prepare its students to be the data scientists of tomorrow. 

COVID-19: The MS Data Science Program WILL run this fall, as usual. We will have procedures in place so that students who cannot get to San Francisco due to travel or visa restrictions can start their program online until those restrictions are lifted. We will be following all official guidelines, including social distancing on our campus, to ensure everyone's health and safety.

Are you a Pacific undergraduate student? Consider adding the Minor in Data Science to your diploma. In addition to giving you highly sought-after skills in data science, this is an excellent way to prepare for our master's program in Data Science. For more info, visit the Minor in Data Science webpage or contact the minor advisor, Prof. J. Hetrick.

Class of 2018
Alex Katona

"The program is intelligently designed for working professionals. I could continue to work full-time while taking classes full-time. This is possible because the in-person classes are held on Saturdays and the online classes are held after office hours. For my capstone project, I got to work with Charles Schwab, which helped me apply everything I'd learned in the program—something very few universities offered. I would strongly recommend this program to anyone considering a career in data science or advancing within the field."

Alex Katona at the San Francisco campus.

Select the link below to take the self-assessment test. This will help you determine whether you have the necessary prerequisite skills to start the Data Science program:

MS Data Science Self-Assessment

  • Bachelor's degree in any field
  • Must have coursework or experience in:
    • Linear algebra
    • Statistics
    • Programming in a high-level language (experience in R and Python preferred)

  • Transcripts (1)
  • Two letters of recommendation
  • Resume
  • Statement of interest (2)

Neither the GRE nor the GMAT is required for admission to this program. 

  • Must have the U.S. equivalent of a 2.65 GPA to be eligible for admission to the program.
  • Must provide an official, course-by-course evaluation of their transcripts with an overall U.S. GPA equivalent from one of the agencies accepted by the University.
  • Must provide official English proficiency test scores from either TOEFL or IELTS.

Note: Course-by-course evaluations (and any other documents that are not directly loaded via GradCAS) should be sent to the following address:

University of the Pacific
Knoles Hall Second Floor, Room 207 B
3601 Pacific Avenue
Stockton, CA 95211-0110

  • (1) An official set of transcripts from each institution you have attended is required as part of the application.
  • (2) The statement of interest allows applicants to demonstrate the motivation, skills, and abilities that will contribute to their academic success in our program. While there is no specific format required for this statement, applicants are advised to give particular consideration to:
    • Academic credentials
    • Experience in the foundational concepts of:
      • Statistics
      • Linear algebra
      • Computer programming (any language, but Python and R are preferred)
    • Commitment and personal stamina to undertake a fast-paced, intensive academic program
    • Enthusiasm for this particular course of study

Curriculum Overview

University of the Pacific's MS in Data Science program uses a hybrid approach that combines the convenience of online learning with hands-on experience in the classroom. Online sessions are taught on weekday evenings, and classroom sessions are taught on weekends. All sessions, including the interactive online sessions, are conducted live with your professors. All lectures are recorded so that students can review them later if necessary.

The program culminates with the Capstone Project, which gives students the opportunity to apply the knowledge they have gained by working with industry professionals to solve a real-world problem.

 

First Semester: Second Semester: Third Semester: Fourth Semester:
  • Analytic Hot Topics
  • Relational Databases
  • Linear Algebra for Data Science 
  • Research Methods for Data Science
  • Analytics Computing for Data Science
  • Frequentist Statistics
  • Weekly Hot Topics
  • Bayesian Statistics
  • Software Methods for Data Science
  • Machine Learning
  • Advanced Machine Learning
  • Time Series Analysis
  • Data Wrangling
  • Weekly Hot Topics
  • Data Engineering for Data Science
  • Introduction to Visualization
  • NoSQL Databases
  • Customer Analytics
  • Emphasis Case Studies
  • Visual Storytelling
  • Healthcare Case Studies
  • Dynamic Visualization
  • Fraud Detection or Legal Analytics
  • Capstone 

*Courses are subject to change.*

RELATIONAL DATABASES


This course introduces relational database management systems (RDBMS) and the structured query language (SQL) for manipulating the data stored in them. The class focuses on the applied use of SQL by data scientists to extract, manipulate, and prepare data for analysis. Although this is not a database design class, students will be exposed to entity-relationship (ER) models and the benefits of third normal form (3NF) data modeling. The class employs hands-on experiential learning using modern relational database query languages and graphical development environments.


LINEAR ALGEBRA FOR DATA SCIENCE


Linear algebra is the generalized study of solutions to systems of linear equations. This course will focus on developing a conceptual understanding of computational tools from linear algebra which are frequently employed in the analysis of data. These tools include formulating linear systems as matrix-vector equations, solving systems of simultaneous equations using technology, performing basic computations involving matrix algebra, solving eigenvalue-eigenvector problems using technology, diagonalization, and orthogonal projections. The use of software to perform computations will be emphasized.


ANALYTICS COMPUTING FOR DATA SCIENCE


This course introduces computational data analysis using multi-paradigm programming languages. By the end of the course, students will be able to tackle complex data analysis problems. The course emphasizes the use of programming languages for statistical analysis, machine learning, and predictive modeling. Graphical analytics tools will also be used. The course covers the packages that come with these languages for accessing data, manipulating and preparing data for analysis, conducting statistical and machine learning analyses, and plotting and visualizing data and analytical results. The course emphasizes hands-on data analysis using a variety of real-world data sets and analytical objectives.


FREQUENTIST STATISTICS FOR DATA SCIENCE


A survey of regression, linear models, and experimental design. Topics include simple and multiple linear regression, single- and multi-factor studies, analysis of variance, analysis of covariance, model selection, and diagnostics. Through the use of modern statistical programming languages, this class focuses more on the application of regression methods than on the underlying theory.


WEEKLY HOT TOPICS


This course consists of a set of weekly presentations and discussions around key analytic issues and current case studies. These hot topics will be presented by a combination of guest speakers (industry luminaries in the area of analytics) and University of the Pacific faculty members, including the MS analytics program director. Many of these topics will be drawn from relevant, contemporary real-world analytic stories that reinforce specific elements of the academic content being taught, so they cannot be predicted in advance. Students will also be introduced to key topics around the use of data and the methods and techniques involved in data science, including ethics, critical thinking, communication skills, presentation skills, and innovation.

 


RESEARCH METHODS


Students learn about research design, qualitative and quantitative research, and sources of data. Topics include data collection procedures, measurement strategies, questionnaire design and content analysis, interviewing techniques, literature surveys, information databases, probability testing, and inferential statistics. Students will prepare and present a research proposal (with emphasis on technical writing and presentation principles) as part of the course.

DATA WRANGLING


This course will teach students how to retrieve data from disparate sources, combine it into a unified format, and prepare it for effective analysis. This aspect of data science is often estimated to account for upwards of 80% of the effort in a typical analytics process. Students will learn how to read data from a variety of common storage formats, evaluate its quality, and apply various techniques for data cleansing. Students will also learn how to select appropriate features for analysis, transform them into more usable formats, and engineer new features that serve as more powerful predictors. This class will also teach students how to split a data set into training and validation data for more effective analytical modeling.


BAYESIAN STATISTICS FOR DATA SCIENCE


This course introduces Bayesian statistical methods that enable data analysts and scientists to combine information from similar experiments, account for complex spatial, temporal, and other relationships, and also incorporate prior information or expert knowledge into a statistical analysis. This course explains the theory behind Bayesian methods and their practical applications, such as social network analysis, predicting crime risk, or predicting credit fraud. The course emphasizes data analysis through the use of modern analytic programming languages. 


TIME SERIES ANALYSIS


This course introduces the theory and application of statistical methods for the analysis of data that have been observed over time. Students will learn techniques for working with time series data and how to account for the correlation that may exist between measurements that are separated by time. The class will concentrate on both univariate and multivariate time series analysis, with a balance between theory and applications. Students will complete a time series analysis project using a real-world scenario and data set.


SOFTWARE METHODS


Students learn the tools, methodology, and etiquette of software development, focusing on developing data science applications, tools, and analytical workflows in collaborative environments. Data scientists are at the nexus of software engineering, science, and business. In order to thrive in this world, they must work collaboratively across these fields and skill sets, while ensuring that their work is accessible and digestible to everyone involved. Moreover, they must ensure their work is production-worthy and extensible. This course teaches all of the elements, both technical and conceptual, needed to create productive, helpful, and professional data scientists.


MACHINE LEARNING


This course introduces the theory and application of machine learning for uncovering patterns and relationships contained in large data sets. Machine learning algorithms offer a set of analytical techniques that is complementary to statistical methods. Students will be exposed to the theory underlying supervised and unsupervised learning methods. Practical application of these techniques will be introduced using R. Additionally, students will learn proper techniques for developing, training, and cross-validating predictive models, study the trade-off between bias and variance, and explore the practical usage of these techniques in business and scientific environments.
 


ADVANCED MACHINE LEARNING


This course builds on the fundamentals introduced in ANLT 222 Machine Learning by examining additional machine learning algorithms and neural network topologies and studying their respective applications. The course includes an overview of the TensorFlow framework, decision tree methods, and an introduction to Natural Language Processing (NLP).


WEEKLY HOT TOPICS


This course consists of a set of weekly presentations and discussions around key analytic issues and current case studies. These hot topics will be presented by a combination of guest speakers (industry luminaries in the area of analytics) and University of the Pacific faculty members, including the MS analytics program director. Many of these topics will be drawn from relevant, contemporary real-world analytic stories that reinforce specific elements of the academic content being taught, so they cannot be predicted in advance.

DATA ENGINEERING


This course introduces students to data warehousing architectures, big data processing pipelines, and in-memory analytic techniques as an alternative to traditional warehouse approaches. The class will provide an overview of conventional data warehousing architectures, focusing on the processing pipeline technologies that enable the management of both SQL and NoSQL data. Students will learn how to design systems to manage large volumes of poly-structured data, including temporal, spatial, spatiotemporal, and multidimensional data. The class will also provide an overview of the benefits of in-memory analytics, focusing on cloud computing and cluster computing architectures and the associated modern toolsets. Students will learn how to design in-memory systems for iterative graphs, complex multistage applications, and fault-tolerant solutions, and how to use modern cloud-based analytic platform services.


INTRODUCTION TO VISUALIZATION


This course introduces tools and methods for visualizing data and communicating information clearly through graphical means. The class covers various data visualizations and how to select the most effective one depending on the nature of the data. Students will work with modern analytic graphics packages, and will be introduced to open source libraries and best-in-class libraries and methods.

The course focuses in part on the technical methods: the actions and commands necessary to ingest data and visualize it. It focuses equally on the concepts and thought processes that make for strong visualizations. One example is data density: as more data is provided to the user in a consumable fashion, the truth necessarily becomes more evident. Another example is the use of line weights and text embedded in the visualization; these techniques help draw the user's eyes to the parts of the visualization that the student wants to emphasize. Finally, the course revisits concepts to which the students have been exposed earlier, such as reproducible reports.


CUSTOMER ANALYTICS


This course introduces the techniques used to analyze consumer shopping and buying behavior using transactional data in industries like retail, grocery, e-commerce, and others. Students will learn how to conduct item affinity (market basket) analysis, trip classification analysis, RFM (recency, frequency, monetary) analysis, churn analysis, and others. This class will teach students how to prepare data for these types of analyses, as well as how to use machine learning and statistical methods to build the models. The class is an experiential learning opportunity that utilizes real-world data sets and scenarios.

FRAUD DETECTION

This course introduces the use of analytics to detect fraud in a variety of contexts. This class shows how to use machine learning techniques to detect fraudulent patterns in historical data, and how to predict future occurrences of fraud. Students will learn how to use supervised learning, unsupervised learning, and social network learning for these types of analyses. Students will be introduced to these techniques in the domains of credit card fraud, healthcare fraud, insurance fraud, employee fraud, telecommunications fraud, web click fraud, and others. The course is experiential and will apply concepts taught in prior data wrangling and machine learning courses using real-world data sets and fraud scenarios. 

OR

LEGAL ANALYTICS


This course introduces the topic of law as it applies to data science and the data scientist. The law is inextricably intertwined with data science in very diverse ways. At a high level, the law impacts data science in three major ways. First, data science greatly facilitates the practice of law, as it does many other domains. Second, compliance laws and regulations determine how data science tasks and projects are undertaken. Third, the law impacts the data scientist in direct and significant ways as a practicing professional. As with most law-school courses, the learning in this course is facilitated by the application of laws to factual scenarios; students will have the opportunity to express their thoughts and to debate underlying issues. Given this, a lab component for this course is not necessary.


DYNAMIC VISUALIZATION


This course introduces advanced visualization techniques for developing dynamic, interactive, and animated data visualizations. Students will learn a variety of techniques for the visualization of complicated data sets. These techniques are valuable for visualizing genomic data, social or other complex networks, healthcare data, business dynamics changing over time, weather and scientific data, and others. Often the visual presentation of data is enhanced when it is made interactive and dynamic, allowing users to "move through" the data and manipulate it graphically for exploratory analysis. This kind of presentation often involves web application development, and students will be exposed to its rudiments as well as to tools that enable faster development of data visualizations.



CAPSTONE INDUSTRY-SPONSORED PROJECT (THROUGHOUT FOURTH SEMESTER)

This course is the culmination of all modules in the MS Data Science program. It provides an experiential learning opportunity that connects all of the materials covered in the program. Students will be formed into teams (typically of three) and assigned to an industry-sponsored project. Capstone projects will be agreed upon in advance with sponsoring companies and will represent real-world business issues that are amenable to an analytic approach. These projects will be conducted under close oversight by the sponsoring company as well as a University of the Pacific (UOP) faculty member, and may be conducted on the sponsoring company's premises using its preferred systems and tools (at the sponsoring company's discretion).

Students will be expected to complete the specific project outcomes defined at the start of the project, including a final presentation to the sponsoring company, its project lead and executive management, as well as Pacific faculty and the program director. The presentation will include a clear explanation of the data sources, data cleanliness and deficiencies, analytic techniques used, derived insights, compelling visualizations, and recommendations. The final report should also indicate any known deficiencies in the results (e.g., due to missing data) and the degree of confidence their customers should have in the insights and recommendations provided.

Contact Us

School of Engineering and Computer Science
(209) 946-2992