Is the Bunsen Python API still maintained? Issues with bunsen Python API (bunsen v 0.5.11) and pyspark v. 3.2.1 #114

jasminziegler · 2022-03-22T16:42:01Z

Is the Bunsen Python API still maintained? Is Bunsen in general still maintained?

Description of Issue

We are trying to connect Kafka, Spark (PySpark), and Bunsen for our project. Reading FHIR resources from Kafka into Spark already works: we get a pyspark.sql.dataframe.DataFrame. However, we hit an error in Bunsen when calling "extract_entry" on the result of "from_json".

Error Message:

TypeError Traceback (most recent call last)
Input In [14], in <cell line: 1>()
----> 1 bundles = extract_entry(spark, from_json(mydf, 'value'), 'condition')

File /usr/local/bunsen/python/bunsen/r4/bundles.py:44, in from_json(df, column)
32 def from_json(df, column):
33 """
34 Takes a dataframe with JSON-encoded bundles in the given column and returns
35 a Java RDD of Bundle records. Note this
(...)
42 :return: a Java RDD of bundles for use with :func:extract_entry
43 """
---> 44 bundles = _bundles(df._sc._jvm)
45 return bundles.fromJson(df._jdf, column)

File /usr/local/bunsen/python/bunsen/r4/bundles.py:15, in _bundles(jvm)
14 def _bundles(jvm):
---> 15 return jvm.com.cerner.bunsen.Bundles.forR4()

TypeError: 'JavaPackage' object is not callable
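As far as we understand py4j, a name that cannot be resolved to a class on the JVM classpath comes back as a JavaPackage, and calling it raises exactly this TypeError. So the error may simply mean that com.cerner.bunsen.Bundles is not visible to the driver JVM. Below is a minimal sketch of a check for that. The helper name `jvm_class_is_loaded` is our own, not part of Bunsen or PySpark, and it assumes py4j's `JavaClass` / `JavaPackage` type names.

```python
def jvm_class_is_loaded(jvm, dotted_name):
    """Return True if dotted_name resolves to a Java class on a py4j JVM view.

    Hypothetical helper: it walks the dotted name attribute by attribute,
    the same way `jvm.com.cerner.bunsen.Bundles` is resolved in bundles.py.
    """
    obj = jvm
    for part in dotted_name.split("."):
        obj = getattr(obj, part)
    # py4j hands back a JavaClass for a resolvable class and a JavaPackage
    # for anything it could not find on the classpath.
    return type(obj).__name__ == "JavaClass"


# Intended usage inside a running session (not executed here):
#   jvm_class_is_loaded(spark.sparkContext._jvm, "com.cerner.bunsen.Bundles")
```

If this returns False, the problem would be on the classpath side (jar not loaded, or the class not present in that jar) rather than in the Python call itself.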

System Configuration

BUNSEN_VERSION=0.5.11
Python 3.8
SPARK_VERSION=3.2.1
SPARK_SCALA_VERSION=2.12
PYSPARK_SUBMIT_ARGS="
--jars /usr/local/bunsen/jars/bunsen-spark-shaded-0.5.11.jar
--packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1
pyspark-shell"
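To rule out a wrong path in the configuration above, the following sketch lists any `--jars` entries that do not exist on the machine running the driver. The function name `missing_jars` is hypothetical, and it only handles the space-separated `--jars <paths>` form used above.

```python
import os
import shlex


def missing_jars(submit_args):
    """Return the paths listed under --jars that are not files on this machine."""
    # shlex.split treats newlines as whitespace, so the multi-line value works.
    tokens = shlex.split(submit_args)
    missing = []
    for i, token in enumerate(tokens):
        if token == "--jars" and i + 1 < len(tokens):
            # --jars takes a comma-separated list of paths.
            for path in tokens[i + 1].split(","):
                if not os.path.isfile(path):
                    missing.append(path)
    return missing


# Intended usage (not executed here):
#   missing_jars(os.environ.get("PYSPARK_SUBMIT_ARGS", ""))
```

An empty list only shows that the files exist. The variable also has to be set before the SparkSession (and with it the JVM) is started, otherwise the jars are not picked up.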

Expected Outcomes

We expected to be able to load and inspect the FHIR bundles with the help of Bunsen.