DB-12042 JMS Input Source #32

jpanko1 · 2021-07-07T15:13:57Z

Short Description

Support JMS input for Spark Structured Streaming.

Long Description

This code was added to be able to read data from IBM MQ via JMS.

How to test

Bob had a server set up and generating data that the streaming code could read.

jaceklaskowski · 2021-07-07T16:56:56Z

spark3.0/build.sbt

 name := "splice-machine-spark-connector"

-val spliceVersion = "3.2.0.2001-SNAPSHOT"
+val spliceVersion = "3.1.0.2016"


Looks suspicious. A downgrade?

It was changed to get in sync with the version of the Splice DB in the environment where this would be running.

jaceklaskowski · 2021-07-07T17:01:20Z

spark3.0/build.sbt

  ExclusionRule(organization = "org.scala-lang.modules", name = "scala-parser-combinators_2.11")
 )

+val excludedDeps = excludedDepsNonSpark ++ Seq(


Wouldn't the following be a bit faster (it prepends to the head) and shorter code-wise?

val excludedDeps = ExclusionRule(organization = "org.apache.spark") +: excludedDepsNonSpark

Updated in commit b79cb27
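For reference, a minimal sketch of the two shapes discussed above. It assumes the elided Seq(...) in the diff holds only the org.apache.spark rule, as the suggestion implies. ExclusionRule here is a stand-in case class so the snippet compiles outside an sbt build, and ExcludedDepsSketch is a hypothetical name.

```scala
// Stand-in for sbt's ExclusionRule so the sketch compiles outside a build definition.
final case class ExclusionRule(organization: String, name: String = "*")

object ExcludedDepsSketch {
  val excludedDepsNonSpark: Seq[ExclusionRule] = Seq(
    ExclusionRule(organization = "org.scala-lang.modules", name = "scala-parser-combinators_2.11")
  )

  // Original shape: concatenate with a one-element Seq (the Spark rule ends up last).
  val appended: Seq[ExclusionRule] =
    excludedDepsNonSpark ++ Seq(ExclusionRule(organization = "org.apache.spark"))

  // Suggested shape: prepend a single element (the Spark rule ends up first).
  val prepended: Seq[ExclusionRule] =
    ExclusionRule(organization = "org.apache.spark") +: excludedDepsNonSpark
}
```

Both values contain the same rules; only the position of the Spark rule differs, which should not matter for a list of exclusions.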

jaceklaskowski

Some more comments. In general, the code looks very old and could benefit from some polishing here and there.

jaceklaskowski · 2021-07-08T12:56:14Z

spark3.0/src/main/scala/org/apache/spark/sql/jms/JmsStreamingSource.scala

+      }
+    }
+    import org.apache.spark.unsafe.types.UTF8String._
+    val internalRDD = messageList.map(message => InternalRow(


The code looks old(ish). The InternalRow conversion is not needed (the data is in process memory anyway). Just convert each JmsMessage to whatever tuple you want and simply call Seq(...).toDF(...).
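A rough sketch of the tuple-based conversion suggested above. The JmsMessage fields and the ToDfSketch object are hypothetical, since the real case class is not shown in this review. This covers only the plain conversion; as far as I know, a streaming Source's getBatch also has to return a DataFrame flagged as streaming, which this sketch does not address.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical stand-in for the PR's JmsMessage; the real fields are not shown here.
case class JmsMessage(messageId: String, content: String)

object ToDfSketch {
  // Converts in-memory messages to a DataFrame without touching InternalRow.
  def toDataFrame(spark: SparkSession, messages: Seq[JmsMessage]): DataFrame = {
    import spark.implicits._ // brings toDF for Seq of tuples into scope
    messages
      .map(m => (m.messageId, m.content))
      .toDF("messageId", "content")
  }
}
```

Since JmsMessage is a case class, messages.toDF() without the tuple step should work as well and would keep the field names automatically.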

jaceklaskowski · 2021-07-08T12:57:09Z

spark3.0/src/main/scala/com/splicemachine/spark/jms/SparkApp.scala

+
+    val query = stream.writeStream
+      .outputMode("append")
+      .format("console")


The memory format would help you with automated testing.
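A sketch of what a memory-sink test could look like. MemoryStream (a test helper that ships with Spark) stands in for the JMS source, and the query name jms_messages and the MemorySinkSketch object are made up for illustration.

```scala
import org.apache.spark.sql.{SQLContext, SparkSession}
import org.apache.spark.sql.execution.streaming.MemoryStream

object MemorySinkSketch {
  // Pushes payloads through a streaming query and reads them back from the memory sink.
  def run(spark: SparkSession, payloads: Seq[String]): Seq[String] = {
    import spark.implicits._
    implicit val sqlContext: SQLContext = spark.sqlContext

    // Stand-in for the JMS source: a stream fed from in-process data.
    val input = MemoryStream[String]
    input.addData(payloads)

    val query = input.toDS().writeStream
      .outputMode("append")
      .format("memory")
      .queryName("jms_messages") // hypothetical name; becomes an in-memory table
      .start()

    try {
      query.processAllAvailable() // block until the added data reaches the sink
      spark.table("jms_messages").as[String].collect().toSeq
    } finally {
      query.stop()
    }
  }
}
```

Unlike the console format, the rows end up in a table the test can query and assert on.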

jaceklaskowski · 2021-07-08T12:57:58Z

spark3.0/src/main/scala/com/splicemachine/spark/jms/JmsSourceRdd.scala

+  */
+class JmsSourceRdd(sc:SparkContext) extends RDD[Message](sc, Nil){
+
+  override def compute(split: Partition, context: TaskContext): Iterator[Message] = ???


Is this class ever used, given these ??? placeholders?
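To illustrate the concern: ??? compiles for any return type but throws scala.NotImplementedError as soon as the method is called, so an RDD built this way would fail on its first computed partition. A small sketch (NotImplementedSketch and its method names are hypothetical):

```scala
object NotImplementedSketch {
  // Same shape as the compute method in the diff: it compiles, but has no body.
  def compute(): Iterator[String] = ???

  // Returns true if calling the placeholder raises NotImplementedError.
  def computeThrows(): Boolean =
    try {
      compute()
      false
    } catch {
      case _: NotImplementedError => true
    }
}
```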

jaceklaskowski · 2021-07-08T12:58:49Z

spark3.0/src/main/scala/com/splicemachine/spark/jms/JmsSourceOffset.scala

+/**
+  * Created by exa00015 on 26/12/18.
+  */
+case class JmsSourceOffset(val id:Long) extends Offset {


There's a LongOffset in Spark Structured Streaming already.
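A quick sketch of the existing class, assuming Spark 3.0's org.apache.spark.sql.execution.streaming.LongOffset. Whether it plugs directly into the Source API used in this PR is not verified here, and LongOffsetSketch is a hypothetical name.

```scala
import org.apache.spark.sql.execution.streaming.LongOffset

object LongOffsetSketch {
  // A Long-backed offset, covering what JmsSourceOffset(id: Long) models.
  val start: LongOffset = LongOffset(0L)

  // Advance the offset by one consumed message.
  val next: LongOffset = start + 1L

  // The JSON form is the number rendered as a string.
  val serialized: String = next.json
}
```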

jaceklaskowski · 2021-07-08T13:01:47Z

spark3.0/src/main/scala/com/splicemachine/spark/jms/JmsDataSourceRelation.scala

+
+
+  override def schema: StructType = {
+    ScalaReflection.schemaFor[JmsMessage].dataType.asInstanceOf[StructType]


What a trick! I think Encoders.product[JmsMessage].schema could work. If so, use it below to create a DataFrame out of JmsMessages.
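Roughly what I have in mind, with hypothetical JmsMessage fields since the real case class is not shown in this hunk (SchemaSketch is a made-up name too):

```scala
import org.apache.spark.sql.Encoders
import org.apache.spark.sql.types.StructType

// Hypothetical stand-in for the PR's JmsMessage; the real fields are not shown here.
case class JmsMessage(messageId: String, content: String)

object SchemaSketch {
  // Derives the schema through the public Encoders API instead of ScalaReflection.
  val schema: StructType = Encoders.product[JmsMessage].schema
}
```

This avoids depending on the internal org.apache.spark.sql.catalyst.ScalaReflection and needs no cast to StructType.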

jaceklaskowski · 2021-07-08T13:02:06Z

spark3.0/src/main/scala/com/splicemachine/spark/jms/JmsDataSourceRelation.scala

+  */
+class JmsDatasourceRelation(override val sqlContext: SQLContext, parameters: Map[String, String]) extends BaseRelation with TableScan with Serializable {
+
+  lazy val RECIEVER_TIMEOUT = parameters.getOrElse("reciever.timeout","3000").toLong


A typo in reciever (should be receiver).

jaceklaskowski · 2021-07-08T13:02:48Z

spark3.0/src/main/scala/com/splicemachine/spark/jms/DefaultSource.scala

+    case "amq" => new AMQConnectionFactoryProvider().createConnection(parameters)
+    case "ibmmq" => new IBMMQConnectionFactoryProvider().createConnection(parameters)
+    case "rmq" => new RMQConnectionFactoryProvider().createConnection(parameters)
+    case "kafka" => new KafkaConnectionFactoryProvider().createConnection(parameters)


We don't need Kafka here, as it's among the built-in data sources.
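For comparison, a sketch of reading from Kafka with Spark's own source. It assumes the spark-sql-kafka-0-10 module is on the classpath and a broker is reachable; the server address and topic are placeholders passed in by the caller, and KafkaSourceSketch is a hypothetical name.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object KafkaSourceSketch {
  // Options understood by Spark's Kafka source; the values come from the caller.
  def kafkaOptions(bootstrapServers: String, topic: String): Map[String, String] =
    Map(
      "kafka.bootstrap.servers" -> bootstrapServers,
      "subscribe" -> topic
    )

  // Builds a streaming DataFrame of key/value strings from the given topic.
  // Needs the Kafka source module on the classpath and a running broker.
  def readKafka(spark: SparkSession, bootstrapServers: String, topic: String): DataFrame =
    spark.readStream
      .format("kafka")
      .options(kafkaOptions(bootstrapServers, topic))
      .load()
      .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
}
```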

martinrupp · 2021-11-15T15:02:17Z

see you!

jpanko1 added 2 commits July 7, 2021 10:09

DB-12042 JMS source.

92f6e88

DB-12042 Build script updates for JMS code.

fd12d28

jpanko1 requested review from OlegMazurov, arnaud-lacurie, ascend1, bklo94, carolp-503, dgomezferro, hatyo, ipraznik-splice, jyuanca, martinrupp, msirek and yxia92 as code owners July 7, 2021 15:13

jaceklaskowski reviewed Jul 7, 2021


jpanko1 added 2 commits July 7, 2021 13:48

DB-12042 Added license.

04144a0

DB-12042 Updated build script for excluded deps.

b79cb27

jaceklaskowski reviewed Jul 8, 2021


martinrupp removed their request for review November 15, 2021 15:02

arnaud-lacurie removed their request for review January 19, 2022 09:54

dgomezferro removed their request for review March 30, 2022 18:21


