Running test 3 1
=== Created temporary folder : /tmp/script_Um4ErsWooq
=== Copying script to temporary folder
=== DONE
=== Executing Script
+ curl -XDELETE 'localhost:9200/bank?pretty'
+ curl -XDELETE 'localhost:9200/shakespeare?pretty'
+ curl -XDELETE 'localhost:9200/apache-logs-*?pretty'
+ curl -XDELETE 'localhost:9200/swiss-*?pretty'
+ mkdir -p /home/mes/input_data
+ cd /home/mes/input_data
+ [[ ! -f shakespeare.json ]]
+ set -e
+ curl -XPUT 'localhost:9200/shakespeare?pretty' -H 'Content-Type: application/json' -d '
{
  "settings": {
    "number_of_shards" : 10,
    "number_of_replicas" : 0
  },
  "mappings" : {
    "_default_" : {
      "properties" : {
        "speaker" : { "type": "keyword" },
        "play_name" : { "type": "keyword" },
        "line_id" : { "type" : "integer" },
        "speech_number" : { "type" : "integer" }
      }
    }
  }
}'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "shakespeare"
}
+ set +e
+ curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/shakespeare/_bulk?pretty' --data-binary @shakespeare.json
+ [[ 0 != 0 ]]
+ set +x
Warning: Ignoring non-spark config property: es.nodes.data.only=false
17/09/21 15:22:04 INFO SparkContext: Running Spark version 2.2.0
17/09/21 15:22:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/09/21 15:22:05 INFO SparkContext: Submitted application: ESTest_3_1
17/09/21 15:22:05 INFO SecurityManager: Changing view acls to: mes
17/09/21 15:22:05 INFO SecurityManager: Changing modify acls to: mes
17/09/21 15:22:05 INFO SecurityManager: Changing view acls groups to:
17/09/21 15:22:05 INFO SecurityManager: Changing modify acls groups to:
17/09/21 15:22:05 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(mes); groups with view permissions: Set(); users with modify permissions: Set(mes); groups with modify permissions: Set()
17/09/21 15:22:06 INFO Utils: Successfully started service 'sparkDriver' on port 36533.
17/09/21 15:22:06 INFO SparkEnv: Registering MapOutputTracker
17/09/21 15:22:06 INFO SparkEnv: Registering BlockManagerMaster
17/09/21 15:22:06 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/09/21 15:22:06 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/09/21 15:22:06 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-9865c2d2-0c01-46f6-ac92-84d3dcb547f2
17/09/21 15:22:06 INFO MemoryStore: MemoryStore started with capacity 246.9 MB
17/09/21 15:22:06 INFO SparkEnv: Registering OutputCommitCoordinator
17/09/21 15:22:07 INFO Utils: Successfully started service 'SparkUI' on port 4040.
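The bulk load above streams shakespeare.json to the `_bulk` endpoint as NDJSON: each document is preceded by an action line, and the whole payload ends with a newline. A minimal sketch of building such a payload (the document below is illustrative, not a line from the real shakespeare.json, and the `_type` value is an assumption):

```python
import json

def to_bulk_ndjson(docs, doc_type="line"):
    """Pair each document with an 'index' action line, NDJSON-style.
    The _bulk API requires a trailing newline after the last line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

# Illustrative placeholder document (not taken from the real file).
docs = [
    {"line_id": 12, "play_name": "Henry IV", "speaker": "KING HENRY IV",
     "text_entry": "Of hostile paces: those opposed eyes,"},
]
payload = to_bulk_ndjson(docs)
```

Such a payload is what `--data-binary @shakespeare.json` sends in the trace above; `--data-binary` matters because plain `-d` would strip the newlines that delimit NDJSON records.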
17/09/21 15:22:07 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.10.10:4040
17/09/21 15:22:08 INFO SparkContext: Added file file:/tmp/script_Um4ErsWooq/3_aggregation_1_es_shakespeare_rdd_legacy.py at spark://192.168.10.10:36533/files/3_aggregation_1_es_shakespeare_rdd_legacy.py with timestamp 1506007328201
17/09/21 15:22:08 INFO Utils: Copying /tmp/script_Um4ErsWooq/3_aggregation_1_es_shakespeare_rdd_legacy.py to /tmp/spark-006c6ba5-b4c2-492b-8300-b190d0d18e5b/userFiles-4027145b-7379-47c2-a5b6-b3c14aa607af/3_aggregation_1_es_shakespeare_rdd_legacy.py
2017-09-21 15:22:09,848:13760(0x7f9ead4e3700):ZOO_INFO@log_env@726: Client environment:zookeeper.version=zookeeper C client 3.4.8
2017-09-21 15:22:09,848:13760(0x7f9ead4e3700):ZOO_INFO@log_env@730: Client environment:host.name=mes_master
2017-09-21 15:22:09,848:13760(0x7f9ead4e3700):ZOO_INFO@log_env@737: Client environment:os.name=Linux
2017-09-21 15:22:09,848:13760(0x7f9ead4e3700):ZOO_INFO@log_env@738: Client environment:os.arch=4.9.0-3-amd64
2017-09-21 15:22:09,848:13760(0x7f9ead4e3700):ZOO_INFO@log_env@739: Client environment:os.version=#1 SMP Debian 4.9.30-2+deb9u3 (2017-08-06)
2017-09-21 15:22:09,850:13760(0x7f9ead4e3700):ZOO_INFO@log_env@747: Client environment:user.name=mes
2017-09-21 15:22:09,850:13760(0x7f9ead4e3700):ZOO_INFO@log_env@755: Client environment:user.home=/home/mes
2017-09-21 15:22:09,850:13760(0x7f9ead4e3700):ZOO_INFO@log_env@767: Client environment:user.dir=/tmp/script_Um4ErsWooq
2017-09-21 15:22:09,850:13760(0x7f9ead4e3700):ZOO_INFO@zookeeper_init@800: Initiating client connection, host=192.168.10.10:2181 sessionTimeout=10000 watcher=0x7f9eb5e6b712 sessionId=0 sessionPasswd= context=0x56062b1ac9c8 flags=0
2017-09-21 15:22:09,852:13760(0x7f9ea93da700):ZOO_INFO@check_events@1728: initiated connection to server [192.168.10.10:2181]
I0921 15:22:09.857571 13902 sched.cpp:232] Version: 1.3.0
2017-09-21 15:22:09,869:13760(0x7f9ea93da700):ZOO_INFO@check_events@1775: session establishment complete on server [192.168.10.10:2181], sessionId=0x15ea4e649aa0010, negotiated timeout=10000
I0921 15:22:09.872509 13896 group.cpp:340] Group process (zookeeper-group(1)@192.168.10.10:38015) connected to ZooKeeper
I0921 15:22:09.874343 13896 group.cpp:830] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I0921 15:22:09.874409 13896 group.cpp:418] Trying to create path '/mesos' in ZooKeeper
I0921 15:22:09.918313 13896 detector.cpp:152] Detected a new leader: (id='19')
I0921 15:22:09.921771 13896 group.cpp:699] Trying to get '/mesos/json.info_0000000019' in ZooKeeper
I0921 15:22:09.926882 13899 zookeeper.cpp:262] A new leading master (UPID=master@192.168.10.10:5050) is detected
I0921 15:22:09.927523 13895 sched.cpp:336] New master detected at master@192.168.10.10:5050
I0921 15:22:09.929133 13895 sched.cpp:352] No credentials provided. Attempting to register without authentication
I0921 15:22:09.951315 13900 sched.cpp:759] Framework registered with 074830c5-66d9-4eaf-b7cf-a2a021070856-0007
17/09/21 15:22:09 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44099.
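The ZooKeeper and Mesos messages above show the driver resolving the leading Mesos master through ZooKeeper before registering as a framework. A master URL of the following form would produce exactly this handshake; this is a sketch consistent with the log, not the actual harness settings:

```python
# Assumed Spark configuration consistent with the log above: the driver
# connects to ZooKeeper on 192.168.10.10:2181, which names the leading
# Mesos master (master@192.168.10.10:5050).
conf = {
    "spark.master": "mesos://zk://192.168.10.10:2181/mesos",
    "spark.app.name": "ESTest_3_1",  # application name from the log
}
```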
17/09/21 15:22:09 INFO NettyBlockTransferService: Server created on 192.168.10.10:44099
17/09/21 15:22:10 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/09/21 15:22:10 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.10.10, 44099, None)
17/09/21 15:22:10 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.10.10:44099 with 246.9 MB RAM, BlockManagerId(driver, 192.168.10.10, 44099, None)
17/09/21 15:22:10 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.10.10, 44099, None)
17/09/21 15:22:10 INFO BlockManager: external shuffle service port = 7337
17/09/21 15:22:10 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.10.10, 44099, None)
17/09/21 15:22:11 INFO EventLoggingListener: Logging events to file:/var/lib/spark/eventlog/074830c5-66d9-4eaf-b7cf-a2a021070856-0007
17/09/21 15:22:11 INFO Utils: Using initial executors = 0, max of spark.dynamicAllocation.initialExecutors, spark.dynamicAllocation.minExecutors and spark.executor.instances
17/09/21 15:22:11 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 0
17/09/21 15:22:11 INFO MesosCoarseGrainedSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
17/09/21 15:22:12 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 275.8 KB, free 246.6 MB)
17/09/21 15:22:13 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.1 KB, free 246.6 MB)
17/09/21 15:22:13 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.10.10:44099 (size: 23.1 KB, free: 246.9 MB)
17/09/21 15:22:13 INFO SparkContext: Created broadcast 0 from newAPIHadoopRDD at PythonRDD.scala:603
17/09/21 15:22:13 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 240.8 KB, free 246.4 MB)
17/09/21 15:22:13 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 23.1 KB, free 246.4 MB)
17/09/21 15:22:13 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.10.10:44099 (size: 23.1 KB, free: 246.9 MB)
17/09/21 15:22:13 INFO SparkContext: Created broadcast 1 from broadcast at PythonRDD.scala:584
17/09/21 15:22:13 INFO Version: Elasticsearch Hadoop v6.0.0-beta2 [66f16fdd93]
17/09/21 15:22:13 INFO EsInputFormat: Reading from [shakespeare]
17/09/21 15:22:14 INFO EsInputFormat: Created [10] splits
17/09/21 15:22:14 INFO SparkContext: Starting job: take at SerDeUtil.scala:203
17/09/21 15:22:14 INFO DAGScheduler: Got job 0 (take at SerDeUtil.scala:203) with 1 output partitions
17/09/21 15:22:14 INFO DAGScheduler: Final stage: ResultStage 0 (take at SerDeUtil.scala:203)
17/09/21 15:22:14 INFO DAGScheduler: Parents of final stage: List()
17/09/21 15:22:14 INFO DAGScheduler: Missing parents: List()
17/09/21 15:22:14 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at PythonHadoopUtil.scala:181), which has no missing parents
17/09/21 15:22:14 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.9 KB, free 246.3 MB)
17/09/21 15:22:14 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1770.0 B, free 246.3 MB)
17/09/21 15:22:14 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.10.10:44099 (size: 1770.0 B, free: 246.9 MB)
17/09/21 15:22:14 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
17/09/21 15:22:14 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at PythonHadoopUtil.scala:181) (first 15 tasks are for partitions Vector(0))
17/09/21 15:22:14 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
17/09/21 15:22:15 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 1
17/09/21 15:22:15 INFO ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 1)
17/09/21 15:22:15 WARN MesosCoarseGrainedSchedulerBackend: Unable to parse into a key:value label for the task.
17/09/21 15:22:16 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 0 is now TASK_RUNNING
17/09/21 15:22:16 INFO TransportClientFactory: Successfully created connection to /192.168.10.10:7337 after 44 ms (0 ms spent in bootstraps)
17/09/21 15:22:16 INFO MesosExternalShuffleClient: Successfully registered app 074830c5-66d9-4eaf-b7cf-a2a021070856-0007 with external shuffle service.
17/09/21 15:22:21 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.10.10:47858) with ID 0
17/09/21 15:22:21 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.10.10:36735 with 366.3 MB RAM, BlockManagerId(0, 192.168.10.10, 36735, None)
17/09/21 15:22:21 INFO ExecutorAllocationManager: New executor 0 has registered (new total is 1)
17/09/21 15:22:21 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 192.168.10.10, executor 0, partition 0, ANY, 42531 bytes)
17/09/21 15:22:23 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.10.10:36735 (size: 1770.0 B, free: 366.3 MB)
17/09/21 15:22:24 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.10.10:36735 (size: 23.1 KB, free: 366.3 MB)
17/09/21 15:22:25 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 4389 ms on 192.168.10.10 (executor 0) (1/1)
17/09/21 15:22:25 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/09/21 15:22:25 INFO DAGScheduler: ResultStage 0 (take at SerDeUtil.scala:203) finished in 11.056 s
17/09/21 15:22:25 INFO DAGScheduler: Job 0 finished: take at SerDeUtil.scala:203, took 11.471589 s
17/09/21 15:22:26 INFO SparkContext: Starting job: runJob at PythonRDD.scala:446
17/09/21 15:22:26 INFO DAGScheduler: Got job 1 (runJob at PythonRDD.scala:446) with 1 output partitions
17/09/21 15:22:26 INFO DAGScheduler: Final stage: ResultStage 1 (runJob at PythonRDD.scala:446)
17/09/21 15:22:26 INFO DAGScheduler: Parents of final stage: List()
17/09/21 15:22:26 INFO DAGScheduler: Missing parents: List()
17/09/21 15:22:26 INFO DAGScheduler: Submitting ResultStage 1 (PythonRDD[3] at RDD at PythonRDD.scala:48), which has no missing parents
17/09/21 15:22:26 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 5.9 KB, free 246.3 MB)
17/09/21 15:22:26 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 3.6 KB, free 246.3 MB)
17/09/21 15:22:26 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 192.168.10.10:44099 (size: 3.6 KB, free: 246.8 MB)
17/09/21 15:22:26 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:1006
17/09/21 15:22:26 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (PythonRDD[3] at RDD at PythonRDD.scala:48) (first 15 tasks are for partitions Vector(0))
17/09/21 15:22:26 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
17/09/21 15:22:26 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, 192.168.10.10, executor 0, partition 0, ANY, 42531 bytes)
17/09/21 15:22:26 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 0
17/09/21 15:22:26 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 192.168.10.10:36735 (size: 3.6 KB, free: 366.3 MB)
17/09/21 15:22:27 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 1759 ms on 192.168.10.10 (executor 0) (1/1)
17/09/21 15:22:27 INFO DAGScheduler: ResultStage 1 (runJob at PythonRDD.scala:446) finished in 1.765 s
17/09/21 15:22:27 INFO DAGScheduler: Job 1 finished: runJob at PythonRDD.scala:446, took 1.865354 s
17/09/21 15:22:27 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
17/09/21 15:22:28 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/tmp/script_Um4ErsWooq/spark-warehouse').
17/09/21 15:22:28 INFO SharedState: Warehouse path is 'file:/tmp/script_Um4ErsWooq/spark-warehouse'.
17/09/21 15:22:30 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
17/09/21 15:22:32 INFO BlockManagerInfo: Removed broadcast_3_piece0 on 192.168.10.10:36735 in memory (size: 3.6 KB, free: 366.3 MB)
17/09/21 15:22:32 INFO BlockManagerInfo: Removed broadcast_3_piece0 on 192.168.10.10:44099 in memory (size: 3.6 KB, free: 246.9 MB)
17/09/21 15:22:37 INFO SparkContext: Starting job: collect at /tmp/script_Um4ErsWooq/3_aggregation_1_es_shakespeare_rdd_legacy.py:54
17/09/21 15:22:37 INFO DAGScheduler: Got job 2 (collect at /tmp/script_Um4ErsWooq/3_aggregation_1_es_shakespeare_rdd_legacy.py:54) with 10 output partitions
17/09/21 15:22:37 INFO DAGScheduler: Final stage: ResultStage 2 (collect at /tmp/script_Um4ErsWooq/3_aggregation_1_es_shakespeare_rdd_legacy.py:54)
17/09/21 15:22:37 INFO DAGScheduler: Parents of final stage: List()
17/09/21 15:22:37 INFO DAGScheduler: Missing parents: List()
17/09/21 15:22:37 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[9] at collect at /tmp/script_Um4ErsWooq/3_aggregation_1_es_shakespeare_rdd_legacy.py:54), which has no missing parents
17/09/21 15:22:37 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 13.0 KB, free 246.3 MB)
17/09/21 15:22:37 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 7.2 KB, free 246.3 MB)
17/09/21 15:22:37 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 192.168.10.10:44099 (size: 7.2 KB, free: 246.8 MB)
17/09/21 15:22:37 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1006
17/09/21 15:22:37 INFO DAGScheduler: Submitting 10 missing tasks from ResultStage 2 (MapPartitionsRDD[9] at collect at /tmp/script_Um4ErsWooq/3_aggregation_1_es_shakespeare_rdd_legacy.py:54) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7, 8, 9))
17/09/21 15:22:37 INFO TaskSchedulerImpl: Adding task set 2.0 with 10 tasks
17/09/21 15:22:37 INFO TaskSetManager: Starting task 2.0 in stage 2.0 (TID 2, 192.168.10.10, executor 0, partition 2, NODE_LOCAL, 42531 bytes)
17/09/21 15:22:37 INFO TaskSetManager: Starting task 5.0 in stage 2.0 (TID 3, 192.168.10.10, executor 0, partition 5, NODE_LOCAL, 42531 bytes)
17/09/21 15:22:37 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 192.168.10.10:36735 (size: 7.2 KB, free: 366.3 MB)
17/09/21 15:22:38 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 2
17/09/21 15:22:38 INFO ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 2)
17/09/21 15:22:39 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 3
17/09/21 15:22:39 INFO ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 3)
17/09/21 15:22:40 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 5
17/09/21 15:22:40 INFO ExecutorAllocationManager: Requesting 2 new executors because tasks are backlogged (new desired total will be 5)
17/09/21 15:22:41 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 9
17/09/21 15:22:41 INFO ExecutorAllocationManager: Requesting 4 new executors because tasks are backlogged (new desired total will be 9)
17/09/21 15:22:42 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 10
17/09/21 15:22:42 INFO ExecutorAllocationManager: Requesting 1 new executor because tasks are backlogged (new desired total will be 10)
17/09/21 15:22:43 INFO TaskSetManager: Starting task 8.0 in stage 2.0 (TID 4, 192.168.10.10, executor 0, partition 8, NODE_LOCAL, 42531 bytes)
17/09/21 15:22:43 INFO TaskSetManager: Finished task 5.0 in stage 2.0 (TID 3) in 6223 ms on 192.168.10.10 (executor 0) (1/10)
17/09/21 15:22:43 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 9
17/09/21 15:22:43 WARN MesosCoarseGrainedSchedulerBackend: Unable to parse into a key:value label for the task.
17/09/21 15:22:43 WARN MesosCoarseGrainedSchedulerBackend: Unable to parse into a key:value label for the task.
17/09/21 15:22:44 INFO TaskSetManager: Finished task 2.0 in stage 2.0 (TID 2) in 6920 ms on 192.168.10.10 (executor 0) (2/10)
17/09/21 15:22:44 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 8
17/09/21 15:22:44 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 2 is now TASK_RUNNING
17/09/21 15:22:44 INFO TransportClientFactory: Successfully created connection to /192.168.10.11:7337 after 28 ms (0 ms spent in bootstraps)
17/09/21 15:22:44 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 1 is now TASK_RUNNING
17/09/21 15:22:44 INFO TransportClientFactory: Successfully created connection to /192.168.10.12:7337 after 12 ms (0 ms spent in bootstraps)
17/09/21 15:22:44 INFO MesosExternalShuffleClient: Successfully registered app 074830c5-66d9-4eaf-b7cf-a2a021070856-0007 with external shuffle service.
17/09/21 15:22:44 INFO MesosExternalShuffleClient: Successfully registered app 074830c5-66d9-4eaf-b7cf-a2a021070856-0007 with external shuffle service.
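The job reads the index through the elasticsearch-hadoop connector (`EsInputFormat` logs "Created [10] splits": one input split per shard of the 10-shard index, hence the 10 output partitions of job 2). A hedged sketch of the legacy `newAPIHadoopRDD` read such a script would perform; the exact conf keys and values used by the real test script are not shown in the log and are assumptions here:

```python
# Assumed read configuration for the es-hadoop connector, consistent
# with the log above (index "shakespeare", the es.nodes.data.only
# property Spark warned about at startup). Running the commented call
# would require a live SparkContext plus the es-hadoop jar.
es_read_conf = {
    "es.nodes": "localhost",
    "es.port": "9200",
    "es.resource": "shakespeare",   # index read in the log
    "es.nodes.data.only": "false",  # the non-spark property from the warning
}

# rdd = sc.newAPIHadoopRDD(
#     inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
#     keyClass="org.apache.hadoop.io.NullWritable",
#     valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
#     conf=es_read_conf)
```

With this shape, each RDD element is a (document id, document fields) pair, which matches the `Row(_1=..., _2={...})` output printed at the end of the run.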
2017-09-21 15:22:46,632:13760(0x7f9ea93da700):ZOO_WARN@zookeeper_interest@1570: Exceeded deadline by 16ms
2017-09-21 15:22:49,978:13760(0x7f9ea93da700):ZOO_WARN@zookeeper_interest@1570: Exceeded deadline by 11ms
2017-09-21 15:22:53,326:13760(0x7f9ea93da700):ZOO_WARN@zookeeper_interest@1570: Exceeded deadline by 14ms
17/09/21 15:22:54 INFO TaskSetManager: Finished task 8.0 in stage 2.0 (TID 4) in 10625 ms on 192.168.10.10 (executor 0) (3/10)
17/09/21 15:22:54 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 7
17/09/21 15:22:54 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.10.12:59284) with ID 1
17/09/21 15:22:54 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 5, 192.168.10.12, executor 1, partition 0, NODE_LOCAL, 42531 bytes)
17/09/21 15:22:54 INFO TaskSetManager: Starting task 3.0 in stage 2.0 (TID 6, 192.168.10.12, executor 1, partition 3, NODE_LOCAL, 42531 bytes)
17/09/21 15:22:54 INFO ExecutorAllocationManager: New executor 1 has registered (new total is 2)
17/09/21 15:22:54 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.10.11:56882) with ID 2
17/09/21 15:22:54 INFO ExecutorAllocationManager: New executor 2 has registered (new total is 3)
17/09/21 15:22:54 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 7, 192.168.10.11, executor 2, partition 1, NODE_LOCAL, 42531 bytes)
17/09/21 15:22:54 INFO TaskSetManager: Starting task 7.0 in stage 2.0 (TID 8, 192.168.10.11, executor 2, partition 7, NODE_LOCAL, 42531 bytes)
17/09/21 15:22:54 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.10.11:42105 with 366.3 MB RAM, BlockManagerId(2, 192.168.10.11, 42105, None)
17/09/21 15:22:54 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.10.12:46143 with 366.3 MB RAM, BlockManagerId(1, 192.168.10.12, 46143, None)
17/09/21 15:22:56 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 192.168.10.12:46143 (size: 7.2 KB, free: 366.3 MB)
17/09/21 15:22:56 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 192.168.10.11:42105 (size: 7.2 KB, free: 366.3 MB)
17/09/21 15:22:58 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.10.12:46143 (size: 23.1 KB, free: 366.3 MB)
17/09/21 15:22:59 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.10.11:42105 (size: 23.1 KB, free: 366.3 MB)
17/09/21 15:23:14 INFO TaskSetManager: Starting task 4.0 in stage 2.0 (TID 9, 192.168.10.10, executor 0, partition 4, ANY, 42531 bytes)
17/09/21 15:23:14 INFO TaskSetManager: Starting task 6.0 in stage 2.0 (TID 10, 192.168.10.10, executor 0, partition 6, ANY, 42531 bytes)
17/09/21 15:23:14 INFO TaskSetManager: Starting task 9.0 in stage 2.0 (TID 11, 192.168.10.11, executor 2, partition 9, NODE_LOCAL, 42531 bytes)
17/09/21 15:23:14 INFO TaskSetManager: Finished task 1.0 in stage 2.0 (TID 7) in 20414 ms on 192.168.10.11 (executor 2) (4/10)
17/09/21 15:23:15 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 6
17/09/21 15:23:15 INFO TaskSetManager: Finished task 3.0 in stage 2.0 (TID 6) in 21183 ms on 192.168.10.12 (executor 1) (5/10)
17/09/21 15:23:15 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 5
17/09/21 15:23:16 INFO TaskSetManager: Finished task 7.0 in stage 2.0 (TID 8) in 22384 ms on 192.168.10.11 (executor 2) (6/10)
17/09/21 15:23:17 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 4
17/09/21 15:23:17 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 5) in 23226 ms on 192.168.10.12 (executor 1) (7/10)
17/09/21 15:23:17 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 3
17/09/21 15:23:22 INFO TaskSetManager: Finished task 4.0 in stage 2.0 (TID 9) in 7452 ms on 192.168.10.10 (executor 0) (8/10)
17/09/21 15:23:22 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 2
17/09/21 15:23:22 INFO TaskSetManager: Finished task 6.0 in stage 2.0 (TID 10) in 7944 ms on 192.168.10.10 (executor 0) (9/10)
17/09/21 15:23:22 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 1
17/09/21 15:23:23 INFO TaskSetManager: Finished task 9.0 in stage 2.0 (TID 11) in 8353 ms on 192.168.10.11 (executor 2) (10/10)
17/09/21 15:23:23 INFO DAGScheduler: ResultStage 2 (collect at /tmp/script_Um4ErsWooq/3_aggregation_1_es_shakespeare_rdd_legacy.py:54) finished in 46.074 s
17/09/21 15:23:23 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
17/09/21 15:23:23 INFO DAGScheduler: Job 2 finished: collect at /tmp/script_Um4ErsWooq/3_aggregation_1_es_shakespeare_rdd_legacy.py:54, took 46.179457 s
17/09/21 15:23:23 INFO MesosCoarseGrainedSchedulerBackend: Capping the total amount of executors to 0
17/09/21 15:23:27 INFO BlockManagerInfo: Removed broadcast_2_piece0 on 192.168.10.10:44099 in memory (size: 1770.0 B, free: 246.8 MB)
17/09/21 15:23:28 INFO BlockManagerInfo: Removed broadcast_2_piece0 on 192.168.10.10:36735 in memory (size: 1770.0 B, free: 366.3 MB)
17/09/21 15:23:28 INFO BlockManagerInfo: Removed broadcast_4_piece0 on 192.168.10.10:36735 in memory (size: 7.2 KB, free: 366.3 MB)
17/09/21 15:23:28 INFO BlockManagerInfo: Removed broadcast_4_piece0 on 192.168.10.10:44099 in memory (size: 7.2 KB, free: 246.9 MB)
17/09/21 15:23:28 INFO BlockManagerInfo: Removed broadcast_4_piece0 on 192.168.10.11:42105 in memory (size: 7.2 KB, free: 366.3 MB)
17/09/21 15:23:28 INFO BlockManagerInfo: Removed broadcast_4_piece0 on 192.168.10.12:46143 in memory (size: 7.2 KB, free: 366.3 MB)
Printing 10 first results
Row(_1=u'11', _2={u'line_number': u'1.1.9', u'speech_number': u'1', u'play_name': u'Henry IV', u'text_entry': u'Of hostile paces: those opposed eyes,', u'line_id': u'12', u'speaker': u'KING HENRY IV'})
Row(_1=u'47', _2={u'line_number': u'1.1.45', u'speech_number': u'2', u'play_name': u'Henry IV', u'text_entry': u'By those Welshwomen done as may not be', u'line_id': u'48', u'speaker': u'WESTMORELAND'})
Row(_1=u'50', _2={u'line_number': u'1.1.48', u'speech_number': u'3', u'play_name': u'Henry IV', u'text_entry': u'Brake off our business for the Holy Land.', u'line_id': u'51', u'speaker': u'KING HENRY IV'})
Row(_1=u'53', _2={u'line_number': u'1.1.51', u'speech_number': u'4', u'play_name': u'Henry IV', u'text_entry': u'Came from the north and thus it did import:', u'line_id': u'54', u'speaker': u'WESTMORELAND'})
Row(_1=u'56', _2={u'line_number': u'1.1.54', u'speech_number': u'4', u'play_name': u'Henry IV', u'text_entry': u'That ever-valiant and approved Scot,', u'line_id': u'57', u'speaker': u'WESTMORELAND'})
Row(_1=u'58', _2={u'line_number': u'1.1.56', u'speech_number': u'4', u'play_name': u'Henry IV', u'text_entry': u'Where they did spend a sad and bloody hour,', u'line_id': u'59', u'speaker': u'WESTMORELAND'})
Row(_1=u'68', _2={u'line_number': u'1.1.66', u'speech_number': u'5', u'play_name': u'Henry IV', u'text_entry': u'And he hath brought us smooth and welcome news.', u'line_id': u'69', u'speaker': u'KING HENRY IV'})
Row(_1=u'69', _2={u'line_number': u'1.1.67', u'speech_number': u'5', u'play_name': u'Henry IV', u'text_entry': u'The Earl of Douglas is discomfited:', u'line_id': u'70', u'speaker': u'KING HENRY IV'})
Row(_1=u'74', _2={u'line_number': u'1.1.72', u'speech_number': u'5', u'play_name': u'Henry IV', u'text_entry': u'To beaten Douglas; and the Earl of Athol,', u'line_id': u'75', u'speaker': u'KING HENRY IV'})
Row(_1=u'75', _2={u'line_number': u'1.1.73', u'speech_number': u'5', u'play_name': u'Henry IV', u'text_entry': u'Of Murray, Angus, and Menteith:', u'line_id': u'76', u'speaker': u'KING HENRY IV'})
Fetched 110487 rows (from collected list)
17/09/21 15:23:41 INFO SparkContext: Invoking stop() from shutdown hook
17/09/21 15:23:42 INFO SparkUI: Stopped Spark web UI at http://192.168.10.10:4040
17/09/21 15:23:42 INFO MesosCoarseGrainedSchedulerBackend: Shutting down all executors
17/09/21 15:23:42 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
17/09/21 15:23:43 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 2 is now TASK_FINISHED
17/09/21 15:23:44 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 1 is now TASK_FINISHED
17/09/21 15:23:45 ERROR LiveListenerBus: SparkListenerBus has already stopped! Dropping event SparkListenerExecutorMetricsUpdate(0,WrappedArray())
17/09/21 15:23:49 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 0 is now TASK_FINISHED
I0921 15:23:49.323814 14450 sched.cpp:2021] Asked to stop the driver
I0921 15:23:49.334317 13899 sched.cpp:1203] Stopping framework 074830c5-66d9-4eaf-b7cf-a2a021070856-0007
17/09/21 15:23:49 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/09/21 15:23:49 INFO MesosCoarseGrainedSchedulerBackend: driver.run() returned with code DRIVER_STOPPED
17/09/21 15:23:49 INFO MemoryStore: MemoryStore cleared
17/09/21 15:23:49 INFO BlockManager: BlockManager stopped
17/09/21 15:23:49 INFO BlockManagerMaster: BlockManagerMaster stopped
17/09/21 15:23:50 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/09/21 15:23:50 INFO SparkContext: Successfully stopped SparkContext
17/09/21 15:23:50 INFO ShutdownHookManager: Shutdown hook called
17/09/21 15:23:50 INFO ShutdownHookManager: Deleting directory /tmp/spark-006c6ba5-b4c2-492b-8300-b190d0d18e5b
17/09/21 15:23:50 INFO ShutdownHookManager: Deleting directory /tmp/spark-006c6ba5-b4c2-492b-8300-b190d0d18e5b/pyspark-30afa90f-a626-4662-b774-3ff1b5e971da
=== DONE
=== Deleting temporary folder
=== DONE
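The repeated "Capping the total amount of executors" / "Requesting N new executors" pairs throughout the run are Spark's dynamic allocation ramping executors up and down against Mesos, backed by the external shuffle service registered on each node. A conf sketch consistent with the log (initial executors = 0, shuffle service on port 7337); the exact values used by the harness are assumptions:

```python
# Assumed dynamic-allocation settings consistent with the log above.
# Dynamic allocation on Spark 2.2 requires the external shuffle service,
# whose port (7337) appears in the BlockManager and shuffle-client lines.
dyn_alloc_conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.initialExecutors": "0",  # "Using initial executors = 0"
    "spark.dynamicAllocation.minExecutors": "0",
    "spark.shuffle.service.enabled": "true",
    "spark.shuffle.service.port": "7337",
}
```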