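Switching to Kryo serialization gives a moderate improvement, but it is not a decisive optimization. Enable it with set("spark.serializer", "org.apache.spark.serializer.KryoSerializer").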
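By default, Spark serializes with Java's ObjectOutputStream and ObjectInputStream. The advantage of this default is convenience: the classes used in operators only need to implement Serializable. The drawback is that serialization is slow and the serialized data takes up a lot of space.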
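To make the trade-off concrete, here is a minimal sketch of the default Java mechanism using only the JDK. The Point and PlainPoint classes and the helper names are hypothetical, made up for illustration. It shows that implementing Serializable is the only requirement, and that the resulting stream is far larger than the two ints it carries, because Java serialization also writes class metadata.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class JavaSerializationDemo {

    // Hypothetical record type: implementing Serializable is all that is required.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Same fields, but without Serializable: writeObject rejects it.
    static class PlainPoint {
        final int x;
        final int y;
        PlainPoint(int x, int y) { this.x = x; this.y = y; }
    }

    // Serialize a value with the default Java mechanism.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Read a value back from its serialized bytes.
    static Object deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    // Returns false when the value (or something it references) is not Serializable.
    static boolean isSerializable(Object value) throws IOException {
        try {
            serialize(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = serialize(new Point(3, 4));
        Point copy = (Point) deserialize(bytes);
        System.out.println("round trip: " + copy.x + "," + copy.y);
        // The two int fields hold 8 bytes of data; the stream also carries class metadata.
        System.out.println("serialized size in bytes: " + bytes.length);
        System.out.println("PlainPoint serializable: " + isSerializable(new PlainPoint(3, 4)));
    }
}
```

The size printed by main depends on the class and field names, so no exact figure is given here. The point is only that the per-object overhead is large compared with the actual data.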
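Spark can use the Kryo serialization mechanism instead. It is faster than Java serialization and produces smaller output, roughly one tenth the size, which reduces network traffic and resource consumption. Kryo takes effect in three places:
1. External variables used in operator functions. In practice this applies when they are shipped as broadcast variables; the task closures themselves are still serialized with Java serialization.
2. Persistence: persisted data takes up less space when a serialized storage level such as MEMORY_ONLY_SER is used.
3. Shuffle.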
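A configuration sketch of the setting above, assuming Spark is on the classpath. The ClickEvent class and the application name are hypothetical placeholders. Registering application classes through registerKryoClasses is optional, but without it Kryo stores the full class name with every object, which wastes some of the space savings.

```java
import org.apache.spark.SparkConf;

public class KryoConfSketch {

    // Hypothetical application class that ends up in shuffles or cached RDDs.
    public static class ClickEvent {
        public long userId;
        public String url;
    }

    public static SparkConf buildConf() {
        return new SparkConf()
            .setAppName("kryo-sketch") // hypothetical application name
            // Switch from the default Java serializer to Kryo.
            .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
            // Register custom classes so Kryo writes a short ID instead of the class name.
            .registerKryoClasses(new Class<?>[] { ClickEvent.class });
    }
}
```

Unlike Java serialization, Kryo does not require the registered classes to implement Serializable.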