Running test 4 1
=== Created temporary folder : /tmp/script_FnRYnLhS8S
=== Copying script to temporary folder
=== DONE
=== Executing Script
+ curl -XDELETE 'localhost:9200/bank?pretty'
+ curl -XDELETE 'localhost:9200/shakespeare?pretty'
+ curl -XDELETE 'localhost:9200/apache-logs-*?pretty'
+ curl -XDELETE 'localhost:9200/swiss-*?pretty'
+ mkdir -p /home/mes/input_data
+ cd /home/mes/input_data
+ [[ ! -f swisscitiespop.txt ]]
+ [[ ! -f tomslee_airbnb_switzerland_1451_2017-07-11.csv ]]
+ set -e
+ mkdir -p airline
Importing swiss cities and population
-------------------------------------------------------------------------------
+ cd airline
+ echo 'Importing swiss cities and population'
+ echo -------------------------------------------------------------------------------
+ curl -XPUT 'localhost:9200/swiss-citypop?pretty' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards" : 15,
    "number_of_replicas" : 0
  }
}
'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100    84    0     0  100    84      0    399 --:--:-- --:--:-- --:--:--   398
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "swiss-citypop"
}
100   173  100    89  100    84    297    281 --:--:-- --:--:-- --:--:--   297
+ echo -e '
input {
  stdin { }
}
filter {
  csv {
    separator => ","
    columns => ["country", "city", "accent_city", "region", "population", "latitude", "longitude"]
  }
  if ([col1] == "year") {
    drop { }
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "swiss-citypop"
  }
  #stdout {}
}
'
+ logstash --pipeline.unsafe_shutdown -f /tmp/import_swiss_citypop.conf
+ cat /home/mes/input_data/swisscitiespop.txt
Sending Logstash's logs to /usr/local/lib/logstash-6.0.0-beta2/logs which is now configured via log4j2.properties
[2017-09-22T10:44:59,675][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/usr/local/lib/logstash-6.0.0-beta2/modules/fb_apache/configuration"}
[2017-09-22T10:44:59,694][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/usr/local/lib/logstash-6.0.0-beta2/modules/netflow/configuration"}
[2017-09-22T10:45:00,125][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2017-09-22T10:45:00,774][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-09-22T10:45:03,211][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2017-09-22T10:45:03,213][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2017-09-22T10:45:03,307][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2017-09-22T10:45:03,317][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-09-22T10:45:03,429][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-09-22T10:45:03,451][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200"]}
[2017-09-22T10:45:03,479][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250, :thread=>"#"}
[2017-09-22T10:45:03,627][INFO ][logstash.pipeline        ] Pipeline started {"pipeline.id"=>"main"}
[2017-09-22T10:45:03,665][INFO ][logstash.agent           ] Pipelines running {:count=>1, :pipelines=>["main"]}
[2017-09-22T10:45:14,993][INFO ][logstash.pipeline        ] Pipeline terminated {"pipeline.id"=>"main"}
Importing swiss airbnb offers
+ echo 'Importing swiss airbnb offers'
-------------------------------------------------------------------------------
+ echo -------------------------------------------------------------------------------
+ curl -XPUT 'localhost:9200/swiss-airbnb?pretty' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards" : 15,
    "number_of_replicas" : 0
  }
}
'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "swiss-airbnb"
}
100   172  100    88  100    84    281    269 --:--:-- --:--:-- --:--:--   281
100   172  100    88  100    84    281    269 --:--:-- --:--:-- --:--:--   281
+ echo -e '
input {
  stdin { }
}
filter {
  csv {
    separator => ","
    columns => ["room_id", "survey_id", "host_id", "room_type", "country", "city", "reviews", "overall_satisfaction", "accommodates", "bedrooms", "bathrooms", "price", "minstay", "last_modified", "latitude", "longitude", "location"]
  }
  if ([col1] == "year") {
    drop { }
  }
}
output {
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "swiss-airbnb"
  }
  #stdout {}
}
'
+ logstash --pipeline.unsafe_shutdown -f /tmp/import_swiss_airbnb.conf
+ cat /home/mes/input_data/tomslee_airbnb_switzerland_1451_2017-07-11.csv
Sending Logstash's logs to /usr/local/lib/logstash-6.0.0-beta2/logs which is now configured via log4j2.properties
[2017-09-22T10:45:37,818][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"fb_apache", :directory=>"/usr/local/lib/logstash-6.0.0-beta2/modules/fb_apache/configuration"}
[2017-09-22T10:45:37,825][INFO ][logstash.modules.scaffold] Initializing module {:module_name=>"netflow", :directory=>"/usr/local/lib/logstash-6.0.0-beta2/modules/netflow/configuration"}
[2017-09-22T10:45:38,443][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2017-09-22T10:45:38,980][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2017-09-22T10:45:42,172][INFO ][logstash.outputs.elasticsearch] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2017-09-22T10:45:42,182][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://localhost:9200/, :path=>"/"}
[2017-09-22T10:45:42,279][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2017-09-22T10:45:42,282][INFO ][logstash.outputs.elasticsearch] Using mapping template from {:path=>nil}
[2017-09-22T10:45:42,348][INFO ][logstash.outputs.elasticsearch] Attempting to install template {:manage_template=>{"template"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s"}, "mappings"=>{"_default_"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}}
[2017-09-22T10:45:42,368][INFO ][logstash.outputs.elasticsearch] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["http://localhost:9200"]}
[2017-09-22T10:45:42,377][INFO ][logstash.pipeline        ] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>5, "pipeline.max_inflight"=>250, :thread=>"#"}
[2017-09-22T10:45:42,425][INFO ][logstash.pipeline        ] Pipeline started {"pipeline.id"=>"main"}
[2017-09-22T10:45:42,473][INFO ][logstash.agent           ] Pipelines running {:count=>1, :pipelines=>["main"]}
[2017-09-22T10:46:30,011][INFO ][logstash.pipeline        ] Pipeline terminated {"pipeline.id"=>"main"}
+ set +e
+ set +x
Warning: Ignoring non-spark config property: es.nodes.data.only=false
17/09/22 10:46:33 INFO SparkContext: Running Spark version 2.2.0
17/09/22 10:46:33 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/09/22 10:46:33 INFO SparkContext: Submitted application: ESTest_4_1
17/09/22 10:46:33 INFO SecurityManager: Changing view acls to: mes
17/09/22 10:46:33 INFO SecurityManager: Changing modify acls to: mes
17/09/22 10:46:33 INFO SecurityManager: Changing view acls groups to:
17/09/22 10:46:33 INFO SecurityManager: Changing modify acls groups to:
17/09/22 10:46:33 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mes); groups with view permissions: Set(); users with modify permissions: Set(mes); groups with modify permissions: Set()
17/09/22 10:46:34 INFO Utils: Successfully started service 'sparkDriver' on port 34133.
17/09/22 10:46:34 INFO SparkEnv: Registering MapOutputTracker
17/09/22 10:46:34 INFO SparkEnv: Registering BlockManagerMaster
17/09/22 10:46:34 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/09/22 10:46:34 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/09/22 10:46:34 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-606e427e-b984-4690-9301-ff7a0caf4a3c
17/09/22 10:46:34 INFO MemoryStore: MemoryStore started with capacity 246.9 MB
17/09/22 10:46:34 INFO SparkEnv: Registering OutputCommitCoordinator
17/09/22 10:46:34 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/09/22 10:46:34 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.10.10:4040
17/09/22 10:46:34 INFO SparkContext: Added file file:/tmp/script_FnRYnLhS8S/4_join_1_swissdata.py at spark://192.168.10.10:34133/files/4_join_1_swissdata.py with timestamp 1506077194975
17/09/22 10:46:34 INFO Utils: Copying /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py to /tmp/spark-5c8f0db8-c88b-41ce-be0e-38720b5a9c96/userFiles-6130d366-6f61-4d98-98cf-1acc208db52f/4_join_1_swissdata.py
2017-09-22 10:46:35,400:2890(0x7f814830f700):ZOO_INFO@log_env@726: Client environment:zookeeper.version=zookeeper C client 3.4.8
2017-09-22 10:46:35,400:2890(0x7f814830f700):ZOO_INFO@log_env@730: Client environment:host.name=mes_master
2017-09-22 10:46:35,400:2890(0x7f814830f700):ZOO_INFO@log_env@737: Client environment:os.name=Linux
2017-09-22 10:46:35,400:2890(0x7f814830f700):ZOO_INFO@log_env@738: Client environment:os.arch=4.9.0-3-amd64
2017-09-22 10:46:35,400:2890(0x7f814830f700):ZOO_INFO@log_env@739: Client environment:os.version=#1 SMP Debian 4.9.30-2+deb9u3 (2017-08-06)
2017-09-22 10:46:35,400:2890(0x7f814830f700):ZOO_INFO@log_env@747: Client environment:user.name=mes
2017-09-22 10:46:35,400:2890(0x7f814830f700):ZOO_INFO@log_env@755: Client environment:user.home=/home/mes
2017-09-22 10:46:35,400:2890(0x7f814830f700):ZOO_INFO@log_env@767: Client environment:user.dir=/tmp/script_FnRYnLhS8S
2017-09-22 10:46:35,400:2890(0x7f814830f700):ZOO_INFO@zookeeper_init@800: Initiating client connection, host=192.168.10.10:2181 sessionTimeout=10000 watcher=0x7f8151e6b712 sessionId=0 sessionPasswd= context=0x7f8154006468 flags=0
I0922 10:46:35.402256  2987 sched.cpp:232] Version: 1.3.0
2017-09-22 10:46:35,404:2890(0x7f8144a07700):ZOO_INFO@check_events@1728: initiated connection to server [192.168.10.10:2181]
2017-09-22 10:46:35,410:2890(0x7f8144a07700):ZOO_INFO@check_events@1775: session establishment complete on server [192.168.10.10:2181], sessionId=0x15ea92e2ecd0008, negotiated timeout=10000
I0922 10:46:35.411494  2983 group.cpp:340] Group process (zookeeper-group(1)@192.168.10.10:42003) connected to ZooKeeper
I0922 10:46:35.411670  2983 group.cpp:830] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0922 10:46:35.411711  2983 group.cpp:418] Trying to create path '/mesos' in ZooKeeper
I0922 10:46:35.421579  2979 detector.cpp:152] Detected a new leader: (id='25')
I0922 10:46:35.422195  2979 group.cpp:699] Trying to get '/mesos/json.info_0000000025' in ZooKeeper
I0922 10:46:35.425822  2979 zookeeper.cpp:262] A new leading master (UPID=master@192.168.10.10:5050) is detected
I0922 10:46:35.426321  2979 sched.cpp:336] New master detected at master@192.168.10.10:5050
I0922 10:46:35.429581  2979 sched.cpp:352] No credentials provided. Attempting to register without authentication
I0922 10:46:35.438491  2981 sched.cpp:759] Framework registered with b192e864-8a9b-4ffc-94ab-953d2b929bd2-0001
17/09/22 10:46:35 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 46475.
17/09/22 10:46:35 INFO NettyBlockTransferService: Server created on 192.168.10.10:46475
17/09/22 10:46:35 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/09/22 10:46:35 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.10.10, 46475, None)
17/09/22 10:46:35 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.10.10:46475 with 246.9 MB RAM, BlockManagerId(driver, 192.168.10.10, 46475, None)
17/09/22 10:46:35 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.10.10, 46475, None)
17/09/22 10:46:35 INFO BlockManager: external shuffle service port = 7337
17/09/22 10:46:35 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.10.10, 46475, None)
17/09/22 10:46:35 INFO EventLoggingListener: Logging events to file:/var/lib/spark/eventlog/b192e864-8a9b-4ffc-94ab-953d2b929bd2-0001
17/09/22 10:46:35 INFO Utils: Using initial executors = 0, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
17/09/22 10:46:35 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 0
17/09/22 10:46:35 INFO MesosCoarseGrainedSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
17/09/22 10:46:36 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/tmp/script_FnRYnLhS8S/spark-warehouse').
17/09/22 10:46:36 INFO SharedState: Warehouse path is 'file:/tmp/script_FnRYnLhS8S/spark-warehouse'.
17/09/22 10:46:37 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
17/09/22 10:46:37 INFO Version: Elasticsearch Hadoop v6.0.0-beta2 [66f16fdd93]
17/09/22 10:46:41 INFO CodeGenerator: Code generated in 220.075879 ms
17/09/22 10:46:41 INFO CodeGenerator: Code generated in 22.129152 ms
17/09/22 10:46:41 INFO CodeGenerator: Code generated in 14.995392 ms
17/09/22 10:46:41 INFO ScalaEsRowRDD: Reading from [swiss-airbnb]
17/09/22 10:46:41 INFO CodeGenerator: Code generated in 25.435905 ms
17/09/22 10:46:42 INFO ScalaEsRowRDD: Reading from [swiss-citypop]
17/09/22 10:46:42 INFO SparkContext: Starting job: collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55
17/09/22 10:46:42 INFO DAGScheduler: Registering RDD 5 (collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55)
17/09/22 10:46:42 INFO DAGScheduler: Registering RDD 9 (collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55)
17/09/22 10:46:42 INFO DAGScheduler: Got job 0 (collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55) with 12 output partitions
17/09/22 10:46:42 INFO DAGScheduler: Final stage: ResultStage 2 (collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55)
17/09/22 10:46:42 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 0, ShuffleMapStage 1)
17/09/22 10:46:42 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 0, ShuffleMapStage 1)
17/09/22 10:46:42 INFO DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[5] at collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55), which has no missing parents
17/09/22 10:46:42 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 12.8 KB, free 246.9 MB)
17/09/22 10:46:42 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.8 KB, free 246.9 MB)
17/09/22 10:46:42 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.10.10:46475 (size: 5.8 KB, free: 246.9 MB)
17/09/22 10:46:42 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
17/09/22 10:46:42 INFO DAGScheduler: Submitting 15 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[5] at collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
17/09/22 10:46:42 INFO TaskSchedulerImpl: Adding task set 0.0 with 15 tasks
17/09/22 10:46:42 INFO DAGScheduler: Submitting ShuffleMapStage 1 (MapPartitionsRDD[9] at collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55), which has no missing parents
17/09/22 10:46:42 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 10.5 KB, free 246.9 MB)
17/09/22 10:46:42 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 5.2 KB, free 246.9 MB)
17/09/22 10:46:42 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.10.10:46475 (size: 5.2 KB, free: 246.9 MB)
17/09/22 10:46:42 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
17/09/22 10:46:42 INFO DAGScheduler: Submitting 15 missing tasks from ShuffleMapStage 1 (MapPartitionsRDD[9] at collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14))
17/09/22 10:46:42 INFO TaskSchedulerImpl: Adding task set 1.0 with 15 tasks
17/09/22 10:46:43 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 1
17/09/22 10:46:43 INFO ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 1)
17/09/22 10:46:44 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 3
17/09/22 10:46:44 INFO ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 3)
17/09/22 10:46:45 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 7
17/09/22 10:46:45 INFO ExecutorAllocationManager: Requesting 4 new executors because tasks are backlogged (new desired total will be 7)
17/09/22 10:46:46 WARN MesosCoarseGrainedSchedulerBackend: Unable to parse into a key:value label for the task.
17/09/22 10:46:46 WARN MesosCoarseGrainedSchedulerBackend: Unable to parse into a key:value label for the task.
17/09/22 10:46:46 WARN MesosCoarseGrainedSchedulerBackend: Unable to parse into a key:value label for the task.
17/09/22 10:46:46 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 0 is now TASK_RUNNING
17/09/22 10:46:46 INFO TransportClientFactory: Successfully created connection to /192.168.10.12:7337 after 35 ms (0 ms spent in bootstraps)
17/09/22 10:46:46 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 2 is now TASK_RUNNING
17/09/22 10:46:46 INFO TransportClientFactory: Successfully created connection to /192.168.10.10:7337 after 8 ms (0 ms spent in bootstraps)
17/09/22 10:46:46 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 1 is now TASK_RUNNING
17/09/22 10:46:46 INFO TransportClientFactory: Successfully created connection to /192.168.10.11:7337 after 24 ms (0 ms spent in bootstraps)
17/09/22 10:46:46 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 15
17/09/22 10:46:46 INFO ExecutorAllocationManager: Requesting 8 new executors because tasks are backlogged (new desired total will be 15)
17/09/22 10:46:46 INFO MesosExternalShuffleClient: Successfully registered app b192e864-8a9b-4ffc-94ab-953d2b929bd2-0001 with external shuffle service.
17/09/22 10:46:46 INFO MesosExternalShuffleClient: Successfully registered app b192e864-8a9b-4ffc-94ab-953d2b929bd2-0001 with external shuffle service.
17/09/22 10:46:46 INFO ContextCleaner: Cleaned accumulator 2
17/09/22 10:46:46 INFO ContextCleaner: Cleaned accumulator 1
17/09/22 10:46:46 INFO MesosExternalShuffleClient: Successfully registered app b192e864-8a9b-4ffc-94ab-953d2b929bd2-0001 with external shuffle service.
17/09/22 10:46:47 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 30
17/09/22 10:46:47 INFO ExecutorAllocationManager: Requesting 15 new executors because tasks are backlogged (new desired total will be 30)
17/09/22 10:46:50 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.10.12:33646) with ID 0
17/09/22 10:46:51 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 0, 192.168.10.12, executor 0, partition 1, NODE_LOCAL, 9018 bytes)
17/09/22 10:46:51 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 1, 192.168.10.12, executor 0, partition 3, NODE_LOCAL, 9018 bytes)
17/09/22 10:46:51 INFO ExecutorAllocationManager: New executor 0 has registered (new total is 1)
17/09/22 10:46:51 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.10.11:49822) with ID 1
17/09/22 10:46:51 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 2, 192.168.10.11, executor 1, partition 0, NODE_LOCAL, 9018 bytes)
17/09/22 10:46:51 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 3, 192.168.10.11, executor 1, partition 2, NODE_LOCAL, 9018 bytes)
17/09/22 10:46:51 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.10.12:33829 with 366.3 MB RAM, BlockManagerId(0, 192.168.10.12, 33829, None)
17/09/22 10:46:51 INFO ExecutorAllocationManager: New executor 1 has registered (new total is 2)
17/09/22 10:46:51 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.10.10:56504) with ID 2
17/09/22 10:46:51 INFO TaskSetManager: Starting task 6.0 in stage 0.0 (TID 4, 192.168.10.10, executor 2, partition 6, NODE_LOCAL, 9018 bytes)
17/09/22 10:46:51 INFO TaskSetManager: Starting task 9.0 in stage 0.0 (TID 5, 192.168.10.10, executor 2, partition 9, NODE_LOCAL, 9018 bytes)
17/09/22 10:46:51 INFO ExecutorAllocationManager: New executor 2 has registered (new total is 3)
17/09/22 10:46:51 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.10.11:44227 with 366.3 MB RAM, BlockManagerId(1, 192.168.10.11, 44227, None)
17/09/22 10:46:51 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.10.10:36679 with 366.3 MB RAM, BlockManagerId(2, 192.168.10.10, 36679, None)
17/09/22 10:46:52 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.10.12:33829 (size: 5.8 KB, free: 366.3 MB)
17/09/22 10:46:52 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.10.11:44227 (size: 5.8 KB, free: 366.3 MB)
17/09/22 10:46:53 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.10.10:36679 (size: 5.8 KB, free: 366.3 MB)
2017-09-22 10:46:58,789:2890(0x7f8144a07700):ZOO_WARN@zookeeper_interest@1570: Exceeded deadline by 11ms
17/09/22 10:46:59 INFO TaskSetManager: Starting task 4.0 in stage 0.0 (TID 6, 192.168.10.12, executor 0, partition 4, NODE_LOCAL, 9018 bytes)
17/09/22 10:46:59 INFO TaskSetManager: Starting task 8.0 in stage 0.0 (TID 7, 192.168.10.12, executor 0, partition 8, NODE_LOCAL, 9018 bytes)
17/09/22 10:46:59 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 0) in 8604 ms on 192.168.10.12 (executor 0) (1/15)
17/09/22 10:46:59 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 1) in 8578 ms on 192.168.10.12 (executor 0) (2/15)
17/09/22 10:47:00 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 28
17/09/22 10:47:00 INFO TaskSetManager: Starting task 11.0 in stage 0.0 (TID 8, 192.168.10.12, executor 0, partition 11, NODE_LOCAL, 9018 bytes)
17/09/22 10:47:00 INFO TaskSetManager: Finished task 8.0 in stage 0.0 (TID 7) in 1589 ms on 192.168.10.12 (executor 0) (3/15)
17/09/22 10:47:01 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 27
17/09/22 10:47:01 INFO TaskSetManager: Starting task 5.0 in stage 0.0 (TID 9, 192.168.10.11, executor 1, partition 5, NODE_LOCAL, 9018 bytes)
17/09/22 10:47:01 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 2) in 10183 ms on 192.168.10.11 (executor 1) (4/15)
17/09/22 10:47:01 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 26
17/09/22 10:47:01 INFO TaskSetManager: Starting task 7.0 in stage 0.0 (TID 10, 192.168.10.11, executor 1, partition 7, NODE_LOCAL, 9018 bytes)
17/09/22 10:47:01 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 3) in 10391 ms on 192.168.10.11 (executor 1) (5/15)
17/09/22 10:47:01 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 25
17/09/22 10:47:02 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 11, 192.168.10.12, executor 0, partition 1, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:02 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 6) in 2760 ms on 192.168.10.12 (executor 0) (6/15)
17/09/22 10:47:02 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 24
17/09/22 10:47:02 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.10.12:33829 (size: 5.2 KB, free: 366.3 MB)
17/09/22 10:47:02 INFO TaskSetManager: Starting task 10.0 in stage 0.0 (TID 12, 192.168.10.10, executor 2, partition 10, NODE_LOCAL, 9018 bytes)
17/09/22 10:47:02 INFO TaskSetManager: Finished task 9.0 in stage 0.0 (TID 5) in 11107 ms on 192.168.10.10 (executor 2) (7/15)
17/09/22 10:47:02 INFO TaskSetManager: Starting task 13.0 in stage 0.0 (TID 13, 192.168.10.10, executor 2, partition 13, NODE_LOCAL, 9018 bytes)
17/09/22 10:47:02 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 4) in 11147 ms on 192.168.10.10 (executor 2) (8/15)
17/09/22 10:47:02 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 22
17/09/22 10:47:02 INFO TaskSetManager: Starting task 9.0 in stage 1.0 (TID 14, 192.168.10.12, executor 0, partition 9, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:02 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 11) in 625 ms on 192.168.10.12 (executor 0) (1/15)
17/09/22 10:47:02 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 21
17/09/22 10:47:02 INFO TaskSetManager: Starting task 10.0 in stage 1.0 (TID 15, 192.168.10.12, executor 0, partition 10, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:02 INFO TaskSetManager: Finished task 9.0 in stage 1.0 (TID 14) in 343 ms on 192.168.10.12 (executor 0) (2/15)
17/09/22 10:47:03 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 20
17/09/22 10:47:03 INFO TaskSetManager: Starting task 12.0 in stage 1.0 (TID 16, 192.168.10.12, executor 0, partition 12, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:03 INFO TaskSetManager: Finished task 11.0 in stage 0.0 (TID 8) in 2334 ms on 192.168.10.12 (executor 0) (9/15)
17/09/22 10:47:03 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 19
17/09/22 10:47:03 INFO TaskSetManager: Starting task 13.0 in stage 1.0 (TID 17, 192.168.10.12, executor 0, partition 13, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:03 INFO TaskSetManager: Finished task 10.0 in stage 1.0 (TID 15) in 454 ms on 192.168.10.12 (executor 0) (3/15)
17/09/22 10:47:03 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 18
17/09/22 10:47:03 INFO TaskSetManager: Finished task 12.0 in stage 1.0 (TID 16) in 473 ms on 192.168.10.12 (executor 0) (4/15)
17/09/22 10:47:03 INFO TaskSetManager: Finished task 13.0 in stage 1.0 (TID 17) in 312 ms on 192.168.10.12 (executor 0) (5/15)
17/09/22 10:47:03 INFO TaskSetManager: Starting task 12.0 in stage 0.0 (TID 18, 192.168.10.11, executor 1, partition 12, NODE_LOCAL, 9018 bytes)
17/09/22 10:47:03 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 9) in 2496 ms on 192.168.10.11 (executor 1) (10/15)
17/09/22 10:47:03 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 16
17/09/22 10:47:03 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 19, 192.168.10.11, executor 1, partition 0, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:03 INFO TaskSetManager: Finished task 7.0 in stage 0.0 (TID 10) in 2320 ms on 192.168.10.11 (executor 1) (11/15)
17/09/22 10:47:03 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 15
17/09/22 10:47:03 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 14
17/09/22 10:47:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.10.11:44227 (size: 5.2 KB, free: 366.3 MB)
17/09/22 10:47:04 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 20, 192.168.10.11, executor 1, partition 2, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:04 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 19) in 697 ms on 192.168.10.11 (executor 1) (6/15)
17/09/22 10:47:04 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 13
17/09/22 10:47:04 INFO TaskSetManager: Starting task 14.0 in stage 0.0 (TID 21, 192.168.10.10, executor 2, partition 14, NODE_LOCAL, 9018 bytes)
17/09/22 10:47:04 INFO TaskSetManager: Finished task 10.0 in stage 0.0 (TID 12) in 2159 ms on 192.168.10.10 (executor 2) (12/15)
17/09/22 10:47:04 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 12
17/09/22 10:47:04 INFO TaskSetManager: Starting task 5.0 in stage 1.0 (TID 22, 192.168.10.11, executor 1, partition 5, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:04 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 20) in 261 ms on 192.168.10.11 (executor 1) (7/15)
17/09/22 10:47:04 INFO TaskSetManager: Finished task 13.0 in stage 0.0 (TID 13) in 2353 ms on 192.168.10.10 (executor 2) (13/15)
17/09/22 10:47:04 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 23, 192.168.10.10, executor 2, partition 3, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:04 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 10
17/09/22 10:47:04 INFO TaskSetManager: Starting task 6.0 in stage 1.0 (TID 24, 192.168.10.11, executor 1, partition 6, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:04 INFO TaskSetManager: Finished task 12.0 in stage 0.0 (TID 18) in 1194 ms on 192.168.10.11 (executor 1) (14/15)
17/09/22 10:47:04 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 9
17/09/22 10:47:04 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.10.10:36679 (size: 5.2 KB, free: 366.3 MB)
17/09/22 10:47:05 INFO TaskSetManager: Starting task 7.0 in stage 1.0 (TID 25, 192.168.10.11, executor 1, partition 7, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:05 INFO TaskSetManager: Finished task 5.0 in stage 1.0 (TID 22) in 315 ms on 192.168.10.11 (executor 1) (8/15)
17/09/22 10:47:05 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 8
17/09/22 10:47:05 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 24) in 326 ms on 192.168.10.11 (executor 1) (9/15)
17/09/22 10:47:05 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 7
17/09/22 10:47:05 INFO TaskSetManager: Finished task 7.0 in stage 1.0 (TID 25) in 237 ms on 192.168.10.11 (executor 1) (10/15)
17/09/22 10:47:05 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 6
17/09/22 10:47:05 INFO TaskSetManager: Starting task 4.0 in stage 1.0 (TID 26, 192.168.10.10, executor 2, partition 4, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:05 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 23) in 604 ms on 192.168.10.10 (executor 2) (11/15)
17/09/22 10:47:05 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 5
17/09/22 10:47:05 INFO TaskSetManager: Starting task 8.0 in stage 1.0 (TID 27, 192.168.10.10, executor 2, partition 8, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:05 INFO TaskSetManager: Finished task 4.0 in stage 1.0 (TID 26) in 207 ms on 192.168.10.10 (executor 2) (12/15)
17/09/22 10:47:05 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 4
17/09/22 10:47:05 INFO TaskSetManager: Starting task 11.0 in stage 1.0 (TID 28, 192.168.10.10, executor 2, partition 11, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:05 INFO TaskSetManager: Finished task 8.0 in stage 1.0 (TID 27) in 266 ms on 192.168.10.10 (executor 2) (13/15)
17/09/22 10:47:05 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 3
17/09/22 10:47:06 INFO TaskSetManager: Starting task 14.0 in stage 1.0 (TID 29, 192.168.10.10, executor 2, partition 14, NODE_LOCAL, 8730 bytes)
17/09/22 10:47:06 INFO TaskSetManager: Finished task 14.0 in stage 0.0 (TID 21) in 1423 ms on 192.168.10.10 (executor 2) (15/15)
17/09/22 10:47:06 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/09/22 10:47:06 INFO DAGScheduler: ShuffleMapStage 0 (collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55) finished in 23.230 s
17/09/22 10:47:06 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 2
17/09/22 10:47:06 INFO DAGScheduler: looking for newly runnable stages
17/09/22 10:47:06 INFO DAGScheduler: running: Set(ShuffleMapStage 1)
17/09/22 10:47:06 INFO DAGScheduler: waiting: Set(ResultStage 2)
17/09/22 10:47:06 INFO DAGScheduler: failed: Set()
17/09/22 10:47:06 INFO TaskSetManager: Finished task 11.0 in stage 1.0 (TID 28) in 261 ms on 192.168.10.10 (executor 2) (14/15)
17/09/22 10:47:06 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 1
17/09/22 10:47:06 INFO TaskSetManager: Finished task 14.0 in stage 1.0 (TID 29) in 204 ms on 192.168.10.10 (executor 2) (15/15)
17/09/22 10:47:06 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
17/09/22 10:47:06 INFO DAGScheduler: ShuffleMapStage 1 (collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55) finished in 23.419 s
17/09/22 10:47:06 INFO DAGScheduler: looking for newly runnable stages
17/09/22 10:47:06 INFO DAGScheduler: running: Set()
17/09/22 10:47:06 INFO DAGScheduler: waiting:
Set(ResultStage 2) 17/09/22 10:47:06 INFO DAGScheduler: failed: Set() 17/09/22 10:47:06 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[14] at collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55), which has no missing parents 17/09/22 10:47:06 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 26.6 KB, free 246.8 MB) 17/09/22 10:47:06 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 9.9 KB, free 246.8 MB) 17/09/22 10:47:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.10.10:46475 (size: 9.9 KB, free: 246.9 MB) 17/09/22 10:47:06 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006 17/09/22 10:47:06 INFO DAGScheduler: Submitting 12 missing tasks from ResultStage 2 (MapPartitionsRDD[14] at collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)) 17/09/22 10:47:06 INFO TaskSchedulerImpl: Adding task set 2.0 with 12 tasks 17/09/22 10:47:06 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 30, 192.168.10.10, executor 2, partition 0, NODE_LOCAL, 5076 bytes) 17/09/22 10:47:06 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 31, 192.168.10.11, executor 1, partition 1, NODE_LOCAL, 5076 bytes) 17/09/22 10:47:06 INFO TaskSetManager: Starting task 2.0 in stage 2.0 (TID 32, 192.168.10.12, executor 0, partition 2, NODE_LOCAL, 5076 bytes) 17/09/22 10:47:06 INFO TaskSetManager: Starting task 3.0 in stage 2.0 (TID 33, 192.168.10.10, executor 2, partition 3, NODE_LOCAL, 5076 bytes) 17/09/22 10:47:06 INFO TaskSetManager: Starting task 4.0 in stage 2.0 (TID 34, 192.168.10.11, executor 1, partition 4, NODE_LOCAL, 5076 bytes) 17/09/22 10:47:06 INFO TaskSetManager: Starting task 5.0 in stage 2.0 (TID 35, 192.168.10.12, executor 0, partition 5, NODE_LOCAL, 5076 bytes) 17/09/22 10:47:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 
192.168.10.10:36679 (size: 9.9 KB, free: 366.3 MB) 17/09/22 10:47:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.10.12:33829 (size: 9.9 KB, free: 366.3 MB) 17/09/22 10:47:06 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.10.11:44227 (size: 9.9 KB, free: 366.3 MB) 17/09/22 10:47:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 192.168.10.12:33646 17/09/22 10:47:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 192.168.10.10:56504 17/09/22 10:47:06 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to 192.168.10.11:49822 17/09/22 10:47:06 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 325 bytes 17/09/22 10:47:07 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 4 17/09/22 10:47:07 INFO ExecutorAllocationManager: Requesting 3 new executors because tasks are backlogged (new desired total will be 4) 17/09/22 10:47:07 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 192.168.10.11:49822 17/09/22 10:47:07 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 1 is 318 bytes 17/09/22 10:47:07 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 192.168.10.12:33646 17/09/22 10:47:07 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to 192.168.10.10:56504 17/09/22 10:47:08 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 5 17/09/22 10:47:08 INFO ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 5) 2017-09-22 10:47:08,811:2890(0x7f8144a07700):ZOO_WARN@zookeeper_interest@1570: Exceeded deadline by 12ms 17/09/22 10:47:08 INFO TaskSetManager: Starting task 6.0 in stage 2.0 (TID 36, 192.168.10.11, executor 1, partition 6, NODE_LOCAL, 5076 bytes) 
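The alternating "Capping the total amount of executors" and "Requesting N new executors" messages above are Spark's dynamic executor allocation negotiating with the Mesos coarse-grained backend as tasks back up and drain. As a reference point only (the actual configuration used by this test is not shown in the log), dynamic allocation on a Spark 2.x cluster is typically enabled with settings along these lines:

```
# Illustrative spark-defaults.conf fragment -- NOT taken from this test setup.
spark.dynamicAllocation.enabled                  true
# Dynamic allocation requires the external shuffle service on every node.
spark.shuffle.service.enabled                    true
spark.dynamicAllocation.minExecutors             0
spark.dynamicAllocation.maxExecutors             9
# How long tasks may stay backlogged before more executors are requested.
spark.dynamicAllocation.schedulerBacklogTimeout  1s
```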
17/09/22 10:47:08 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 31) in 2558 ms on 192.168.10.11 (executor 1) (1/12)
17/09/22 10:47:08 INFO TaskSetManager: Starting task 7.0 in stage 2.0 (TID 37, 192.168.10.11, executor 1, partition 7, NODE_LOCAL, 5076 bytes)
17/09/22 10:47:08 INFO TaskSetManager: Finished task 4.0 in stage 2.0 (TID 34) in 2648 ms on 192.168.10.11 (executor 1) (2/12)
17/09/22 10:47:09 INFO TaskSetManager: Starting task 8.0 in stage 2.0 (TID 38, 192.168.10.11, executor 1, partition 8, NODE_LOCAL, 5076 bytes)
17/09/22 10:47:09 INFO TaskSetManager: Finished task 6.0 in stage 2.0 (TID 36) in 432 ms on 192.168.10.11 (executor 1) (3/12)
17/09/22 10:47:09 INFO TaskSetManager: Starting task 9.0 in stage 2.0 (TID 39, 192.168.10.11, executor 1, partition 9, NODE_LOCAL, 5076 bytes)
17/09/22 10:47:09 INFO TaskSetManager: Finished task 7.0 in stage 2.0 (TID 37) in 348 ms on 192.168.10.11 (executor 1) (4/12)
17/09/22 10:47:09 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 7
17/09/22 10:47:09 INFO ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 7)
17/09/22 10:47:09 INFO TaskSetManager: Starting task 10.0 in stage 2.0 (TID 40, 192.168.10.12, executor 0, partition 10, NODE_LOCAL, 5076 bytes)
17/09/22 10:47:09 INFO TaskSetManager: Finished task 2.0 in stage 2.0 (TID 32) in 3282 ms on 192.168.10.12 (executor 0) (5/12)
17/09/22 10:47:09 INFO TaskSetManager: Starting task 11.0 in stage 2.0 (TID 41, 192.168.10.10, executor 2, partition 11, NODE_LOCAL, 5076 bytes)
17/09/22 10:47:09 INFO TaskSetManager: Finished task 3.0 in stage 2.0 (TID 33) in 3369 ms on 192.168.10.10 (executor 2) (6/12)
17/09/22 10:47:09 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 6
17/09/22 10:47:09 INFO TaskSetManager: Finished task 5.0 in stage 2.0 (TID 35) in 3424 ms on 192.168.10.12 (executor 0) (7/12)
17/09/22 10:47:09 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 30) in 3481 ms on 192.168.10.10 (executor 2) (8/12)
17/09/22 10:47:09 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 4
17/09/22 10:47:09 INFO TaskSetManager: Finished task 8.0 in stage 2.0 (TID 38) in 556 ms on 192.168.10.11 (executor 1) (9/12)
17/09/22 10:47:09 INFO TaskSetManager: Finished task 9.0 in stage 2.0 (TID 39) in 604 ms on 192.168.10.11 (executor 1) (10/12)
17/09/22 10:47:09 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 2
17/09/22 10:47:09 INFO TaskSetManager: Finished task 10.0 in stage 2.0 (TID 40) in 364 ms on 192.168.10.12 (executor 0) (11/12)
17/09/22 10:47:10 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 1
17/09/22 10:47:10 INFO TaskSetManager: Finished task 11.0 in stage 2.0 (TID 41) in 400 ms on 192.168.10.10 (executor 2) (12/12)
17/09/22 10:47:10 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
17/09/22 10:47:10 INFO DAGScheduler: ResultStage 2 (collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55) finished in 3.779 s
17/09/22 10:47:10 INFO DAGScheduler: Job 0 finished: collect at /tmp/script_FnRYnLhS8S/4_join_1_swissdata.py:55, took 27.790924 s
17/09/22 10:47:10 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 0
Printing 10 first results
Row(room_id=u'7284708', country=u'Switzerland', city=u'Aeugst am Albis', room_type=u'Private room', bedrooms=u'1.0', bathrooms=None, price=u'42.0', reviews=u'0', overall_satisfaction=u'0.0', latitude=u'47.281159', longitude=u'8.478718', latitude=None, longitude=None, population=None, region=None)
Row(room_id=u'1853391', country=u'Switzerland', city=u'Affoltern am Albis', room_type=u'Private room', bedrooms=u'1.0', bathrooms=None, price=u'102.0', reviews=u'11', overall_satisfaction=u'4.5', latitude=u'47.277891', longitude=u'8.449805', latitude=u'47.281224', longitude=u'8.45346', population=None, region=u'01')
Row(room_id=u'15761219', country=u'Switzerland', city=u'Affoltern am Albis', room_type=u'Entire home/apt', bedrooms=u'1.0', bathrooms=None, price=u'75.0', reviews=u'1', overall_satisfaction=u'0.0', latitude=u'47.272614', longitude=u'8.449907', latitude=u'47.281224', longitude=u'8.45346', population=None, region=u'01')
Row(room_id=u'18419767', country=u'Switzerland', city=u'Affoltern am Albis', room_type=u'Entire home/apt', bedrooms=u'1.0', bathrooms=None, price=u'48.0', reviews=u'3', overall_satisfaction=u'4.0', latitude=u'47.27947', longitude=u'8.450784', latitude=u'47.281224', longitude=u'8.45346', population=None, region=u'01')
Row(room_id=u'19026774', country=u'Switzerland', city=u'Affoltern am Albis', room_type=u'Entire home/apt', bedrooms=u'4.0', bathrooms=None, price=u'193.0', reviews=u'0', overall_satisfaction=u'0.0', latitude=u'47.284338', longitude=u'8.453748', latitude=u'47.281224', longitude=u'8.45346', population=None, region=u'01')
Row(room_id=u'6895426', country=u'Switzerland', city=u'Affoltern am Albis', room_type=u'Private room', bedrooms=u'1.0', bathrooms=None, price=u'59.0', reviews=u'16', overall_satisfaction=u'4.5', latitude=u'47.27642', longitude=u'8.441259', latitude=u'47.281224', longitude=u'8.45346', population=None, region=u'01')
Row(room_id=u'17666068', country=u'Switzerland', city=u'Affoltern am Albis', room_type=u'Entire home/apt', bedrooms=u'2.0', bathrooms=None, price=u'129.0', reviews=u'2', overall_satisfaction=u'0.0', latitude=u'47.290323', longitude=u'8.442326', latitude=u'47.281224', longitude=u'8.45346', population=None, region=u'01')
Row(room_id=u'5134515', country=u'Switzerland', city=u'Affoltern am Albis', room_type=u'Entire home/apt', bedrooms=u'2.0', bathrooms=None, price=u'129.0', reviews=u'4', overall_satisfaction=u'4.5', latitude=u'47.290284', longitude=u'8.44185', latitude=u'47.281224', longitude=u'8.45346', population=None, region=u'01')
Row(room_id=u'19144898', country=u'Switzerland', city=u'Affoltern am Albis', room_type=u'Entire home/apt', bedrooms=u'1.0', bathrooms=None, price=u'48.0', reviews=u'0', overall_satisfaction=u'0.0', latitude=u'47.278981', longitude=u'8.447358', latitude=u'47.281224', longitude=u'8.45346', population=None, region=u'01')
Row(room_id=u'12013387', country=u'Switzerland', city=u'Affoltern am Albis', room_type=u'Entire home/apt', bedrooms=u'2.0', bathrooms=None, price=u'78.0', reviews=u'6', overall_satisfaction=u'5.0', latitude=u'47.276212', longitude=u'8.449886', latitude=u'47.281224', longitude=u'8.45346', population=None, region=u'01')
Computed 27744 positions (from collected list)
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
  File "/usr/lib/python2.7/threading.py", line 754, in run
  File "/usr/lib/python2.7/SocketServer.py", line 230, in serve_forever
: 'NoneType' object has no attribute 'select'
17/09/22 10:47:13 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.10.12:33829 in memory (size: 5.8 KB, free: 366.3 MB)
17/09/22 10:47:13 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.10.11:44227 in memory (size: 5.8 KB, free: 366.3 MB)
17/09/22 10:47:13 INFO SparkContext: Invoking stop() from shutdown hook
17/09/22 10:47:13 INFO SparkUI: Stopped Spark web UI at http://192.168.10.10:4040
17/09/22 10:47:14 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.10.10:46475 in memory (size: 5.8 KB, free: 246.9 MB)
17/09/22 10:47:14 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.10.10:36679 in memory (size: 5.8 KB, free: 366.3 MB)
17/09/22 10:47:14 INFO MesosCoarseGrainedSchedulerBackend: Shutting down all executors
17/09/22 10:47:14 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
17/09/22 10:47:15 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 1 is now TASK_FINISHED
17/09/22 10:47:16 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 0 is now TASK_FINISHED
17/09/22 10:47:18 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 2 is now TASK_FINISHED
I0922 10:47:18.327436 3269 sched.cpp:2021] Asked to stop the driver
I0922 10:47:18.328963 2982 sched.cpp:1203] Stopping framework b192e864-8a9b-4ffc-94ab-953d2b929bd2-0001
17/09/22 10:47:18 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/09/22 10:47:18 INFO MesosCoarseGrainedSchedulerBackend: driver.run() returned with code DRIVER_STOPPED
17/09/22 10:47:18 INFO MemoryStore: MemoryStore cleared
17/09/22 10:47:18 INFO BlockManager: BlockManager stopped
17/09/22 10:47:18 INFO BlockManagerMaster: BlockManagerMaster stopped
17/09/22 10:47:18 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/09/22 10:47:18 INFO SparkContext: Successfully stopped SparkContext
17/09/22 10:47:18 INFO ShutdownHookManager: Shutdown hook called
17/09/22 10:47:18 INFO ShutdownHookManager: Deleting directory /tmp/spark-5c8f0db8-c88b-41ce-be0e-38720b5a9c96/pyspark-e20d54bf-c895-4b77-99b2-d32b75ab567c
17/09/22 10:47:18 INFO ShutdownHookManager: Deleting directory /tmp/spark-5c8f0db8-c88b-41ce-be0e-38720b5a9c96
=== DONE
=== Deleting temporary folder
=== DONE
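The duplicated latitude/longitude fields and the trailing None values in the Row output above are what a left outer join between the Airbnb listings and the swiss-citypop records looks like when a listing's city has no match (or a matched column is empty). A minimal plain-Python sketch of that join semantics, where the field names follow the Row output but the helper function and sample data are hypothetical and not taken from 4_join_1_swissdata.py:

```python
def left_join_on_city(listings, citypop):
    """Left-join listing dicts with city-population dicts by 'city'.

    Listings whose city is absent from citypop keep None for the
    population-side fields, mirroring e.g. population=None above.
    """
    by_city = {c["city"]: c for c in citypop}
    joined = []
    for row in listings:
        match = by_city.get(row["city"])
        merged = dict(row)
        merged["population"] = match["population"] if match else None
        merged["region"] = match["region"] if match else None
        joined.append(merged)
    return joined

# Hypothetical sample rows shaped like the real data sets.
listings = [
    {"room_id": "7284708", "city": "Aeugst am Albis", "price": "42.0"},
    {"room_id": "1853391", "city": "Affoltern am Albis", "price": "102.0"},
]
citypop = [{"city": "Affoltern am Albis", "population": "11000", "region": "01"}]

result = left_join_on_city(listings, citypop)
```

With these inputs, the first listing gets None for both joined fields (no matching city), while the second picks up the population and region, just as in the printed Rows.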