Points/C-coins required: 10 2019-09-16 04:57:59 14.53MB PDF

Data Visualization with Python and JavaScript
Crafting a Data-visualisation Toolchain for the Web

Kyran Dale

Beijing · Boston · Farnham · Sebastopol · Tokyo
O'REILLY

Data Visualization with Python and JavaScript
by Kyran Dale

Copyright © 2016 Kyran Dale. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles. For more information, contact our corporate/institutional sales department.

Editors: Dawn Schanafelt and Meghan Blanchette
Production Editor: FILL IN PRODUCTION EDITOR
Copyeditor: FILL IN COPYEDITOR
Proofreader: FILL IN PROOFREADER
Indexer: FILL IN INDEXER
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Rebecca Demarest

January -4712: First Edition

Revision History for the First Edition
2016-02-22: First Early Release
2016-03-21: Second Early Release

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. Data Visualization with Python and JavaScript, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc.

While the publisher and the author(s) have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author(s) disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

978-1-491-95643-4
[FILL IN]

Table of Contents

Introduction
1. A Development Setup
    Python
    JavaScript
    Databases
    Integrated Development Environments
    Summary

Part I. A Basic Toolkit

2. A Language Learning Bridge Between Python and JavaScript
    Similarities and Differences
    Interacting with the Code
    Basic Bridge Work
    Differences in Practice
    A Cheatsheet
    Summary
3. Reading and Writing Data with Python
    Easy Does It
    Passing Data Around
    Working with System Files
    CSV, TSV and Row-column Data-formats
    JSON
    SQL
    MongoDB
    Dealing with Dates, Times and Complex Data
    Summary
4. Webdev 101
    The Big Picture
    Single-page Apps
    Tooling Up
    Building a Web-page
    Chrome's Developer Tools
    A Basic Page with Placeholders
    Scalable Vector Graphics (SVG)
    Summary

Part II. Getting Your Data

5. Getting Data off the Web with Python
    Getting Web-data with the requests Library
    Getting Data-files with requests
    Using Python to Consume Data from a Web-API
    Using Libraries to Access Web-APIs
    Scraping Data
    Summary
6. Heavyweight Scraping with Scrapy
    Setting up Scrapy
    Establishing the Targets
    Targeting HTML with Xpaths
    A First Scrapy Spider
    Scraping the Individual Biography Pages
    Chaining Requests and Yielding Data
    Scrapy Pipelines
    Scraping Text and Images with a Pipeline
    Summary

Introduction

This book aims to get you up to speed with what is, in my opinion, the most powerful data-visualisation stack going: Python and JavaScript. You'll learn enough of big libraries like Pandas and D3 to start crafting your own web data-visualisations and refining your own toolchain. Expertise will come with practice, but this book presents a shallow learning curve to basic competence.

If you're reading this in Early Release form I'd love to hear any feedback you have. Please post … Kyran

You'll also find a working copy of the Nobel visualisation the book literally and figuratively builds towards at dataviz/index.html

The bulk of this book tells one of the innumerable tales of data-visualisation, one carefully selected to showcase some powerful Python and JavaScript libraries or tools which together form a toolchain. This toolchain gathers raw, unrefined data at its start and delivers a rich, engaging web-visualisation at its end. Like all tales of data-visualisation it is a tale of transformation, in this case transforming a basic Wikipedia list of Nobel prize-winners into an interactive visualisation, bringing the data to life and making exploration of the prize's history easy and fun.

A primary motivation for writing the book is the belief that, whatever data you have, whatever story you want to tell with it, the natural home for the visualizations you transform it into is the web. As a delivery platform it is orders of magnitude more powerful than what came before, and this book aims to smooth the passage from desktop or server-based data analysis and processing to getting the fruits of that labour out on the web.

But the most ambitious aim of this book is to persuade you that working with these two powerful languages towards the goal of delivering powerful web-visualisations is actually fun and engaging.

I think many potential data-viz programmers assume there is a big divide, called Web Development, between them and what they would like to do, which is program in Python and JavaScript. Web-dev involves loads of arcane knowledge about markup languages, style scripts, administration etc. and can't be done without tools with strange names like Gulp or Yeoman. I aim to show that these days that big divide can be collapsed to a thin and very permeable membrane, allowing you to focus on what you do well, programming stuff (see Figure P-1), with minimal effort, relegating the web-servers to data delivery.

Figure P-1. Here be web-dev dragons (panels: Perception, Reality)

Who This Book Is For

First off, this book is for anyone with a reasonable grasp of Python or JavaScript who wants to explore one of the most exciting areas in the data-processing ecosystem right now, the exploding field of data-visualisation for the web. It's also about addressing some specific pain-points which in my experience are quite common.

When you get commissioned to write a technical book, chances are your editor will sensibly caution you to think in terms of pain points that your book aims to address. The two key pain points of this book are best illustrated by way of a couple of stories: one my own, the other one that has been told to me in various guises by JavaScripters I know.

Many years ago, as an academic researcher, I came across Python and fell in love. I had been writing some fairly complex simulations in C(++) and Python's simplicity and power was a breath of fresh air from all the boilerplate makefiles, declarations and definitions and the like. Programming was fun, Python the perfect glue, playing nicely with my C(++) libraries (Python wasn't then and still isn't a speed demon) and doing, with consummate ease, all the stuff that in low-level languages is such a pain, e.g. file I/O, database access, serialisation etc. I started to write all my graphical user interfaces (GUIs) and visualisations in Python, using wxPython, PyQt and a whole load of other refreshingly easy toolsets. Now there's some stuff there that I think is pretty cool, but I doubt I'll ever get around to the necessary packaging, version checking and various other hurdles to distribution, so no-one else will ever see it.

At the time there existed what in theory was the perfect universal distribution system for the software I'd so lovingly crafted, namely the web-browser. Available on pretty much every computer on earth, with its own built-in, interpreted programming language: write once, run everywhere. But everyone knew that a.
Python doesn't play in the web-browser's sandpit and b. browsers were incapable of ambitious graphics and visualisations, being pretty much limited to static images and the odd jQuery transformation. JavaScript was a toy language tied to a very slow interpreter, good for little DOM tricks but certainly nothing approaching what I could do on the desktop with Python. So that route was discounted, out of hand. My visualisations wanted to be on the web but there was no route through.

Fast forward a decade or so and, thanks to an arms race initiated by Google and their V8 engine, JavaScript is now orders of magnitude faster; in fact it's now an awful lot faster than Python. HTML has also tidied up its act a bit, in the guise of HTML5. It's a lot nicer to work with, with much less boilerplate. What were loosely followed and distinctly shaky protocols like Scalable Vector Graphics (SVG) have firmed up nicely thanks to powerful visualisation libraries, D3 being preeminent. Modern browsers are obliged to work nicely with SVG and, increasingly, 3D in the form of WebGL and its children such as THREE.js. Those visualisations I was doing in Python are now possible on your local web-browser and the payoff is that, with very little effort, they can be made accessible to every desktop, laptop, smartphone and tablet in the world.

See here for a fairly jaw-dropping comparison.

So why aren't Pythonistas flocking to get their data out there in a form they dictate? After all, the alternative to crafting it yourself is leaving it to somebody else, something most data-scientists I know would find far from ideal. Well, first there's that term Web development, connoting complicated markup, opaque stylesheets, a whole slew of new tools to learn, IDEs to master.
And then there's JavaScript itself, a strange language, thought of as little more than a toy until recently and having something of the neither fish nor fowl to it.

I aim to take those pain-points head-on and show that you can craft modern web-visualisations (often single-page apps) with a very minimal amount of HTML and CSS boilerplate, allowing you to focus on the programming, and that JavaScript is an easy leap for the Pythonista, having a lot in common. But you don't have to leap: Chapter 2 is a language-bridge, which aims to help Pythonistas and JavaScripters bridge the divide between the languages by highlighting common elements and providing simple translations.

The second story is a common one I run into among JavaScript data-visualisers I know. Processing data in JavaScript is far from ideal. There are few heavyweight libraries and, although recent functional enhancements to the language make data-munging much more pleasant, there's still no real data-processing ecosystem to speak of. So there's a distinct asymmetry between the hugely powerful visualisation libraries available, D3 as ever paramount, and the ability to clean and process any data delivered to the browser. All of this mandates doing your data-cleaning, processing and exploration in another language or with a toolkit like Tableau, and this often devolves into piecemeal forays into vaguely remembered Matlab, the steepish learning curve that is R, or a Java library or two.

Toolkits like Tableau, although very impressive, are often, in my experience, ultimately frustrating for programmers. There's no way to replicate in a GUI the expressive power of a good, general-purpose…

Preview: 127 pages Data-Visualization-with-Python-and-JavaScript-Scrape-Clean-Explore-Transform-Your-Data.pdf.pdf