Data Science in Production

Author : Ben Weber
Total Pages : 234
Release : 2020
ISBN-10 : 165206463X
ISBN-13 : 9781652064633

Book Synopsis: Data Science in Production by Ben Weber

Data Science in Production by Ben Weber was released in 2020 and has 234 pages.

Book excerpt:

Putting predictive models into production is one of the most direct ways that data scientists can add value to an organization. By learning how to build and deploy scalable model pipelines, data scientists can own more of the model production process and more rapidly deliver data products. This book provides a hands-on approach to scaling up Python code to work in distributed environments in order to build robust pipelines. Readers will learn how to set up machine learning models as web endpoints, serverless functions, and streaming pipelines using multiple cloud environments. It is intended for analytics practitioners with hands-on experience with Python libraries such as Pandas and scikit-learn, and focuses on scaling up prototype models to production.

From startups to trillion-dollar companies, data science is playing an important role in helping organizations maximize the value of their data. This book helps data scientists level up their careers by taking ownership of data products, with applied examples that demonstrate how to:

- Translate models developed on a laptop to scalable deployments in the cloud
- Develop end-to-end systems that automate data science workflows
- Own a data product from conception to production

The accompanying Jupyter notebooks provide examples of scalable pipelines across multiple cloud environments, tools, and libraries (github.com/bgweber/DS_Production).

Book Contents

Here are the topics covered by Data Science in Production:

Chapter 1: Introduction - This chapter will motivate the use of Python and discuss the discipline of applied data science, present the data sets, models, and cloud environments used throughout the book, and provide an overview of automated feature engineering.

Chapter 2: Models as Web Endpoints - This chapter shows how to use web endpoints for consuming data and how to host machine learning models as endpoints using the Flask and Gunicorn libraries. We'll start with scikit-learn models and also set up a deep learning endpoint with Keras.

Chapter 3: Models as Serverless Functions - This chapter will build upon the previous chapter and show how to set up model endpoints as serverless functions using AWS Lambda and GCP Cloud Functions.

Chapter 4: Containers for Reproducible Models - This chapter will show how to use containers for deploying models with Docker. We'll also explore scaling up with ECS and Kubernetes, and building web applications with Plotly Dash.

Chapter 5: Workflow Tools for Model Pipelines - This chapter focuses on scheduling automated workflows using Apache Airflow. We'll set up a workflow that pulls data from BigQuery, applies a model, and saves the results.

Chapter 6: PySpark for Batch Modeling - This chapter will introduce readers to PySpark using the community edition of Databricks. We'll build a batch model pipeline that pulls data from a data lake, generates features, applies a model, and stores the results to a NoSQL database.

Chapter 7: Cloud Dataflow for Batch Modeling - This chapter will introduce the core components of Cloud Dataflow and implement a batch model pipeline for reading data from BigQuery, applying an ML model, and saving the results to Cloud Datastore.

Chapter 8: Streaming Model Workflows - This chapter will introduce readers to Kafka and PubSub for streaming messages in a cloud environment. After working through this material, readers will know how to use these message brokers to create streaming model pipelines with PySpark and Dataflow that provide near real-time predictions.

Excerpts of these chapters are available on Medium (@bgweber), and a book sample is available on Leanpub.


Data Science in Production Related Books

Machine Learning in Production
Language: en
Pages: 465
Authors: Andrew Kelleher
Categories: Computers
Type: BOOK - Published: 2019-02-27 - Publisher: Addison-Wesley Professional

Foundational Hands-On Skills for Succeeding with Real Data Science Projects. This pragmatic book introduces both machine learning and data science, bridging gaps
Effective Data Science Infrastructure
Language: en
Pages: 350
Authors: Ville Tuulos
Categories: Computers
Type: BOOK - Published: 2022-08-16 - Publisher: Simon and Schuster

Effective Data Science Infrastructure: How to make data scientists more productive is a hands-on guide to assembling infrastructure for data science and machine
Data Science on AWS
Language: en
Pages: 524
Authors: Chris Fregly
Categories: Computers
Type: BOOK - Published: 2021-04-07 - Publisher: O'Reilly Media, Inc.

With this practical book, AI and machine learning practitioners will learn how to successfully build and deploy data science projects on Amazon Web Services. Th
Mastering Java for Data Science
Language: en
Pages: 355
Authors: Alexey Grigorev
Categories: Computers
Type: BOOK - Published: 2017-04-27 - Publisher: Packt Publishing Ltd

Use Java to create a diverse range of Data Science applications and bring Data Science into production. About This Book: An overview of modern Data Science and Ma