www.twosigma.com
Improving Python and Spark
Performance and Interoperability
February 9, 2017 All Rights Reserved
Wes McKinney @wesmckinn
Spark Summit East 2017
February 9, 2017
Me
• Currently: Software Architect at Two Sigma Investments
• Creator of Python pandas project
• PMC member for Apache Arrow and Apache Parquet
• Other Python projects: Ibis, Feather, statsmodels
• Formerly: Cloudera, DataPad, AQR
• Author of Python for Data Analysis
Important Legal Information
The information presented here is offered for informational purposes only and should not be
used for any other purpose (including, without limitation, the making of investment decisions).
Examples provided herein are for illustrative purposes only and are not necessarily based on
actual data. Nothing herein constitutes: an offer to sell or the solicitation of any offer to buy any
security or other interest; tax advice; or investment advice. This presentation shall remain the
property of Two Sigma Investments, LP (“Two Sigma”) and Two Sigma reserves the right to
require the return of this presentation at any time.
Some of the images, logos or other material used herein may be protected by copyright and/or
trademark. If so, such copyrights and/or trademarks are most likely owned by the entity that
created the material and are used purely for identification and comment as fair use under
international copyright and/or trademark laws. Use of such image, copyright or trademark does
not imply any association with such organization (or endorsement of such organization) by Two
Sigma, nor vice versa.
Copyright © 2017 TWO SIGMA INVESTMENTS, LP. All rights reserved
This talk
• Why some parts of PySpark are “slow”
• Technology that can help make things faster
• Work we have done to make improvements
• Future roadmap
Python and Spark
• Spark is implemented in Scala and runs on the Java virtual machine (JVM)
• Spark has Python and R APIs with partial or full coverage for many parts of the Scala Spark API
• In some Spark tasks, Python is only a scripting front-end
    • This means no interpreted Python code is executed once the Spark job starts
• Other PySpark jobs suffer performance and interoperability issues that we’re going to analyze in this talk
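The distinction above can be sketched in PySpark. This is an illustrative example, not taken from the slides: the names `plus_one` and `compare_plans` are hypothetical, and `spark` is assumed to be an existing `SparkSession`. The first query uses only built-in column expressions, so Python merely describes the work; the second wraps a Python function as a UDF, so data must be sent to a Python worker process and back.

```python
def plus_one(x):
    """Plain Python function; as a UDF it runs inside Python worker processes."""
    return None if x is None else x + 1


def compare_plans(spark):
    """Build the same column two ways and return both DataFrames.

    `spark` is assumed to be an existing pyspark.sql.SparkSession.
    """
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import LongType

    df = spark.range(1000)  # one column named "id"

    # 1) Built-in column expression: Python is only the scripting
    #    front-end; the addition is evaluated inside the JVM.
    jvm_only = df.select((F.col("id") + 1).alias("id_plus_one"))

    # 2) Python UDF: values are serialized to a Python worker,
    #    passed through plus_one, and serialized back to the JVM.
    plus_one_udf = F.udf(plus_one, LongType())
    via_python = df.select(plus_one_udf(F.col("id")).alias("id_plus_one"))

    return jvm_only, via_python
```

Both DataFrames should hold the same values; calling `.explain()` on each is one way to see that only the second plan contains a Python evaluation step.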