feat: data source options #535

Merged 30 commits on Jun 26, 2025
@shehabgamin (Contributor) commented Jun 16, 2025

Future work is needed to support Spark’s specific Parquet global options, as well as DataFusion’s column_specific_options and key_value_metadata.
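
For readers matching the test report's messages to Spark option names: the "Data Source option ... is not supported yet" errors in the report use snake_case keys such as 'line_sep' and 'multi_line', which look like mechanical conversions of Spark's camelCase reader options (`lineSep`, `multiLine`). The sketch below only illustrates that correspondence. `camel_to_snake` is a hypothetical helper, not code from this PR, and whether the project normalizes option names this way is an assumption.

```python
import re


def camel_to_snake(name: str) -> str:
    """Insert "_" before each interior uppercase letter, then lowercase.

    Hypothetical helper for illustration only; not taken from the PR.
    """
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


# Spark reader option names whose snake_case forms appear in the report.
spark_options = [
    "lineSep",
    "multiLine",
    "ignoreLeadingWhiteSpace",
    "primitivesAsString",
]

for option in spark_options:
    print(f"{option} -> {camel_to_snake(option)}")
```

Searching the report for the converted key is a quick way to check whether a given Spark option currently hits the unsupported path.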

github-actions bot commented Jun 16, 2025

Spark Test Report

Commit Information

Commit  Revision  Branch
After   49795a6   refs/pull/535/merge
Before  ddaeea1   refs/heads/main

Test Summary

Suite Commit Failed Passed Skipped Warnings Time (s)
doctest-catalog After 13 12 3 5.57
Before 13 12 3 5.64
doctest-column After 33 3 6.05
Before 33 3 5.97
doctest-dataframe After 28 78 1 4 9.02
Before 28 78 1 4 9.06
doctest-functions After 136 266 7 8 15.09
Before 136 266 7 8 15.06
test-connect After 219 817 135 283 142.07
Before 219 817 135 283 140.20

Test Details

Error Counts
          396 Total
(+1)      206 Total Unique
-------- ---- ----------------------------------------------------------------------------------------------------------
           24 DocTestFailure
           15 UnsupportedOperationException: streaming query manager command
           14 UnsupportedOperationException: lambda function
           13 AssertionError: AnalysisException not raised
           10 UnsupportedOperationException: unsupported data source format: "text"
           10 handle add artifacts
            9 PySparkAssertionError: [DIFFERENT_PANDAS_DATAFRAME] DataFrames are not almost equal:
            8 AssertionError: False is not true
            8 UnsupportedOperationException: hint
            6 AnalysisException: Cannot cast to Decimal128(14, 7). Overflowing on NaN
            6 UnsupportedOperationException: function: window
            6 UnsupportedOperationException: write stream operation start
            5 UnsupportedOperationException: function: monotonically_increasing_id
            5 UnsupportedOperationException: sample
            4 AssertionError: "TABLE_OR_VIEW_NOT_FOUND" does not match "No table named 'v'"
            4 PySparkNotImplementedError: [NOT_IMPLEMENTED] rdd() is not implemented.
            4 UnsupportedOperationException: sample by
            4 UnsupportedOperationException: unknown aggregate function: hll_sketch_agg
            4 UnsupportedOperationException: unpivot
            3 IllegalArgumentException: invalid argument: empty data source paths
(+3)        3 IllegalArgumentException: invalid argument: unsupported file format: TEXT
            3 UnsupportedOperationException: PlanNode::CacheTable
            3 UnsupportedOperationException: function: input_file_name
            3 UnsupportedOperationException: function: ~
            3 UnsupportedOperationException: handle analyze input files
            3 ValueError: Converting to Python dictionary is not supported when duplicate field names are present
            2 AnalysisException: Could not find config namespace "spark"
            2 AnalysisException: map requires all value types to be the same
            2 AnalysisException: two values expected: [Column(Column { relation: None, name: "#2" }), Column(Colum...
            2 AssertionError
            2 AssertionError: AnalysisException not raised by <lambda>
            2 AssertionError: Lists differ: [Row([22 chars](key=1, value='1'), Row(key=10, value='10'), R[2402 cha...
            2 IllegalArgumentException: expected value at line 1 column 1
            2 IllegalArgumentException: invalid argument: found FUNCTION at 5:13 expected 'DATABASE', 'SCHEMA', 'T...
            2 PythonException:  ZeroDivisionError: division by zero
            2 SparkRuntimeException: start_from index out of bounds
            2 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
(+2)        2 UnsupportedOperationException: Data Source option 'line_sep' is not supported yet.
(+2)        2 UnsupportedOperationException: Data Source option 'multi_line' is not supported yet.
            2 UnsupportedOperationException: approx quantile
            2 UnsupportedOperationException: collect metrics
            2 UnsupportedOperationException: freq items
            2 UnsupportedOperationException: function: bitmap_bit_position
            2 UnsupportedOperationException: function: format_number
            2 UnsupportedOperationException: function: from_csv
            2 UnsupportedOperationException: function: from_json
            2 UnsupportedOperationException: function: inline
            2 UnsupportedOperationException: function: map_entries
            2 UnsupportedOperationException: function: sec
            2 UnsupportedOperationException: function: shiftrightunsigned
            2 UnsupportedOperationException: handle analyze is local
            2 UnsupportedOperationException: handle analyze same semantics
            2 UnsupportedOperationException: list functions
            2 UnsupportedOperationException: pivot
            2 UnsupportedOperationException: position with 3 arguments is not supported yet
            2 UnsupportedOperationException: rebalance partitioning by expression
            2 UnsupportedOperationException: unknown aggregate function: collect_set
            2 UnsupportedOperationException: unresolved regex
            2 UnsupportedOperationException: unsupported data source format: "orc"
            2 UnsupportedOperationException: user defined data type should only exist in a field
            2 handle artifact statuses
            2 received metadata size exceeds hard limit (19714 vs. 16384);  :status:42B content-type:60B grpc-stat...
            1 AnalysisException: Cannot cast string 'abc' to value of Float64 type
            1 AnalysisException: Cannot cast value 'abc' to value of Boolean type
            1 AnalysisException: Cannot infer common argument type for comparison operation Boolean = Float64
            1 AnalysisException: Error parsing timestamp from '2023-01-01' using format '%d-%m-%Y': input contains...
            1 AnalysisException: Failed to coerce arguments to satisfy a call to 'nth_value' function: coercion fr...
            1 AnalysisException: Failed to parse placeholder id: cannot parse integer from empty string
            1 AnalysisException: Inconsistent data type across values list at row 1 column 1. Was Map(Field { name...
            1 AnalysisException: Table 'tbl1' already exists
            1 AnalysisException: UNION queries have different number of columns: left has 3 columns whereas right ...
            1 AnalysisException: view not found: tab2
            1 AssertionError: "2000000" does not match "raise_error expects a single UTF-8 string argument"
(+1)        1 AssertionError: "CSV header does not conform to the schema" does not match "Data Source option 'enfo...
(+1)        1 AssertionError: "Database 'memory:282240ec-659a-45e5-8ec4-ed95abae131a' dropped." does not match "in...
(+1)        1 AssertionError: "Database 'memory:f0ae7acb-7990-4c3f-8889-26c7d30ebdfa' dropped." does not match "in...
            1 AssertionError: "TABLE_OR_VIEW_NOT_FOUND" does not match "The table test_table already exists"
            1 AssertionError: "attribute.*missing" does not match "cannot resolve attribute: ObjectName([Identifie...
            1 AssertionError: "foobar" does not match "raise_error expects a single UTF-8 string argument"
            1 AssertionError: '+---[17 chars]-----+\n|                        x|\n+--------[132 chars]-+\n' != '+-...
            1 AssertionError: ArrayIndexOutOfBoundsException not raised
            1 AssertionError: Exception not raised
            1 AssertionError: Lists differ: [Row([14 chars] _c1=25, _c2='I am Hyukjin\n\nI love Spark!'),[86 chars...
            1 AssertionError: Lists differ: [Row(id=90, name='90'), Row(id=91, name='91'), Ro[176 chars]99')] != [...
            1 AssertionError: Lists differ: [Row(key='0'), Row(key='1'), Row(key='10'), Row(ke[1435 chars]99')] !=...
            1 AssertionError: Lists differ: [Row(ln(id)=0.0, ln(id)=0.0, struct(id, name)=Row(id=[1232 chars]0'))]...
            1 AssertionError: Row(point='[1.0, 2.0]', pypoint='[3.0, 4.0]') != Row(point='(1.0, 2.0)', pypoint='[3...
            1 AssertionError: StorageLevel(False, True, True, False, 1) != StorageLevel(False, False, False, False...
            1 AssertionError: Struc[30 chars]estampType(), True), StructField('val', IntegerType(), True)]) != Str...
            1 AssertionError: Struc[32 chars]e(), False), StructField('b', DoubleType(), Fa[158 chars]ue)]) != Str...
            1 AssertionError: Struc[40 chars]ue), StructField('val', ArrayType(DoubleType(), False), True)]) != St...
            1 AssertionError: Struc[64 chars]Type(), True), StructField('i', StringType(), True)]), False)]) != St...
            1 AssertionError: Struc[69 chars]e(), True), StructField('name', StringType(), True)]), True)]) != Str...
            1 AssertionError: YearMonthIntervalType(0, 1) != YearMonthIntervalType(0, 0)
            1 AssertionError: [1.0, 2.0] != ExamplePoint(1.0,2.0)
            1 AttributeError: 'DataFrame' object has no attribute '_ipython_key_completions_'
            1 AttributeError: 'DataFrame' object has no attribute '_joinAsOf'
(+1)        1 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpxt1s_dql'
            1 IllegalArgumentException: 83140 is too large to store in a Decimal128 of precision 4. Max is 9999
            1 IllegalArgumentException: column types must match schema types, expected Int64 but found List(Field ...
            1 IllegalArgumentException: column types must match schema types, expected LargeUtf8 but found Utf8 at...
            1 IllegalArgumentException: invalid argument: found FUNCTION at 7:15 expected 'DATABASE', 'SCHEMA', 'O...
            1 IllegalArgumentException: invalid argument: invalid digit found in string
(+1)        1 IllegalArgumentException: invalid argument: missing source
            1 ParseException: Error parsing timestamp from '1997/02/28 10:30:00': error parsing date
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] foreach() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] foreachPartition() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] localCheckpoint() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] sparkContext() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] toJSON() is not implemented.
            1 PythonException:  AttributeError: 'NoneType' object has no attribute 'partitionId'
            1 PythonException:  AttributeError: 'list' object has no attribute 'x'
            1 PythonException:  AttributeError: 'list' object has no attribute 'y'
            1 SparkRuntimeException: Failed due to a difference in schemas: original schema: DFSchema { inner: Sch...
            1 UnknownTimeZoneError: 'PST'
            1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
            1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
            1 UnsupportedOperationException: COUNT DISTINCT with multiple arguments
(+1)        1 UnsupportedOperationException: Data Source option 'ignore_leading_white_space' is not supported yet.
(+1)        1 UnsupportedOperationException: Data Source option 'primitives_as_string' is not supported yet.
            1 UnsupportedOperationException: Insert into not implemented for this table
            1 UnsupportedOperationException: PlanNode::ClearCache
            1 UnsupportedOperationException: PlanNode::IsCached
            1 UnsupportedOperationException: SHOW FUNCTIONS
            1 UnsupportedOperationException: bucketing
            1 UnsupportedOperationException: deduplicate within watermark
            1 UnsupportedOperationException: function exists
            1 UnsupportedOperationException: function: array_insert
            1 UnsupportedOperationException: function: array_sort
            1 UnsupportedOperationException: function: arrays_zip
            1 UnsupportedOperationException: function: bit_count
            1 UnsupportedOperationException: function: bit_get
            1 UnsupportedOperationException: function: bitmap_bucket_number
            1 UnsupportedOperationException: function: bitmap_count
            1 UnsupportedOperationException: function: bround
            1 UnsupportedOperationException: function: conv
            1 UnsupportedOperationException: function: convert_timezone
            1 UnsupportedOperationException: function: csc
            1 UnsupportedOperationException: function: elt
            1 UnsupportedOperationException: function: format_string
            1 UnsupportedOperationException: function: getbit
            1 UnsupportedOperationException: function: inline_outer
            1 UnsupportedOperationException: function: java_method
            1 UnsupportedOperationException: function: json_object_keys
            1 UnsupportedOperationException: function: json_tuple
            1 UnsupportedOperationException: function: make_dt_interval
            1 UnsupportedOperationException: function: make_interval
            1 UnsupportedOperationException: function: make_timestamp_ltz
            1 UnsupportedOperationException: function: map_concat
            1 UnsupportedOperationException: function: map_from_entries
            1 UnsupportedOperationException: function: months_between
            1 UnsupportedOperationException: function: parse_url
            1 UnsupportedOperationException: function: printf
            1 UnsupportedOperationException: function: reflect
            1 UnsupportedOperationException: function: regexp_extract
            1 UnsupportedOperationException: function: regexp_extract_all
            1 UnsupportedOperationException: function: regexp_instr
            1 UnsupportedOperationException: function: regexp_substr
            1 UnsupportedOperationException: function: schema_of_csv
            1 UnsupportedOperationException: function: schema_of_json
            1 UnsupportedOperationException: function: sentences
            1 UnsupportedOperationException: function: session_window
            1 UnsupportedOperationException: function: soundex
            1 UnsupportedOperationException: function: spark_partition_id
            1 UnsupportedOperationException: function: split
            1 UnsupportedOperationException: function: stack
            1 UnsupportedOperationException: function: str_to_map
            1 UnsupportedOperationException: function: to_char
            1 UnsupportedOperationException: function: to_csv
            1 UnsupportedOperationException: function: to_json
            1 UnsupportedOperationException: function: to_number
            1 UnsupportedOperationException: function: to_unix_timestamp
            1 UnsupportedOperationException: function: to_utc_timestamp
            1 UnsupportedOperationException: function: to_varchar
            1 UnsupportedOperationException: function: try_add
            1 UnsupportedOperationException: function: try_divide
            1 UnsupportedOperationException: function: try_multiply
            1 UnsupportedOperationException: function: try_subtract
            1 UnsupportedOperationException: function: try_to_number
            1 UnsupportedOperationException: function: url_decode
            1 UnsupportedOperationException: function: url_encode
            1 UnsupportedOperationException: function: width_bucket
            1 UnsupportedOperationException: function: xpath
            1 UnsupportedOperationException: function: xpath_boolean
            1 UnsupportedOperationException: function: xpath_double
            1 UnsupportedOperationException: function: xpath_float
            1 UnsupportedOperationException: function: xpath_int
            1 UnsupportedOperationException: function: xpath_long
            1 UnsupportedOperationException: function: xpath_number
            1 UnsupportedOperationException: function: xpath_short
            1 UnsupportedOperationException: function: xpath_string
            1 UnsupportedOperationException: handle analyze semantic hash
            1 UnsupportedOperationException: make_timestamp with timezone is not yet implemented
            1 UnsupportedOperationException: unknown aggregate function: bitmap_or_agg
            1 UnsupportedOperationException: unknown aggregate function: count_if
            1 UnsupportedOperationException: unknown aggregate function: count_min_sketch
            1 UnsupportedOperationException: unknown aggregate function: grouping_id
            1 UnsupportedOperationException: unknown aggregate function: histogram_numeric
            1 UnsupportedOperationException: unknown aggregate function: percentile
            1 UnsupportedOperationException: unknown aggregate function: try_avg
            1 UnsupportedOperationException: unknown aggregate function: try_sum
            1 UnsupportedOperationException: unknown function: distributed_sequence_id
            1 UnsupportedOperationException: unknown function: product
            1 ValueError: Code in Status proto (StatusCode.INTERNAL) doesn't match status code (StatusCode.RESOURC...
            1 ValueError: The column label 'id' is not unique.
            1 ValueError: The column label 'struct' is not unique.
(-3)        0 AnalysisException: Unable to find factory for TEXT
(-1)        0 AssertionError: "Database 'memory:3de04733-611a-49f7-8353-7581a01e4899' dropped." does not match "in...
(-1)        0 AssertionError: "Database 'memory:faae10a4-6d3d-4c2f-b815-6c43c01b54a0' dropped." does not match "in...
(-1)        0 AssertionError: Exception not raised by <lambda>
(-1)        0 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpt0bd_r50'
(-1)        0 ParseException: Error while parsing value 'id' as type 'Int32' for column 0 at line 0. Row data: '[i...
(-4)        0 UnsupportedOperationException: JSON data source read options are not supported yet
(-1)        0 UnsupportedOperationException: data writer option: ignoretrailingwhitespace
(-1)        0 UnsupportedOperationException: partitioning columns
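
Each row of the Error Counts table above has an optional `(+N)` or `(-N)` delta against the "Before" commit, the count in the "After" run, and the (possibly truncated) error message. A minimal sketch for pulling out the rows that changed, assuming that layout holds for every row; `ErrorCount` and `parse_error_count` are hypothetical names, not part of the report tooling:

```python
import re
from typing import NamedTuple, Optional


class ErrorCount(NamedTuple):
    delta: int    # change versus the "Before" commit; 0 when no marker is shown
    count: int    # occurrences in the "After" run
    message: str  # error text, kept exactly as printed (including any "...")


# Optional "(+N)" / "(-N)" marker, then the count, then the message.
LINE_RE = re.compile(r"^\s*(?:\(([+-]\d+)\))?\s*(\d+)\s+(.*\S)\s*$")


def parse_error_count(line: str) -> Optional[ErrorCount]:
    """Parse one table row; return None for separators and other lines."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    delta, count, message = match.groups()
    return ErrorCount(int(delta) if delta else 0, int(count), message)


# Three rows copied from the table above.
sample = [
    "(+3)        3 IllegalArgumentException: invalid argument: unsupported file format: TEXT",
    "           24 DocTestFailure",
    "(-3)        0 AnalysisException: Unable to find factory for TEXT",
]

rows = [row for row in map(parse_error_count, sample) if row is not None]
# Errors that did not occur before: the whole count is new.
new = [row.message for row in rows if row.delta > 0 and row.count == row.delta]
# Errors that no longer occur after the change.
resolved = [row.message for row in rows if row.delta < 0 and row.count == 0]
```

On these sample rows, `new` holds the "unsupported file format: TEXT" message and `resolved` holds "Unable to find factory for TEXT". The two "Total" rows at the top of the table also match the pattern, so filter them out before aggregating.
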
Passed Tests Diff

(empty)

Failed Tests
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.cacheTable
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.clearCache
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.createTable
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.functionExists
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.getDatabase
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.getFunction
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.isCached
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.listDatabases
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.listFunctions
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.recoverPartitions
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.refreshByPath
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.refreshTable
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.uncacheTable
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame._ipython_key_completions_
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame._joinAsOf
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.checkpoint
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.coalesce
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.colRegex
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.dropDuplicatesWithinWatermark
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.explain
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.foreach
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.foreachPartition
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.hint
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.inputFiles
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.isLocal
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.isStreaming
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.localCheckpoint
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.observe
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.randomSplit
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.rdd
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.repartition
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.repartitionByRange
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.sameSemantics
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.sample
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.sampleBy
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.storageLevel
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.toJSON
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.unpivot
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.withWatermark
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.writeStream
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrameStatFunctions.sampleBy
pyspark/sql/functions.py::pyspark.sql.functions.aggregate
pyspark/sql/functions.py::pyspark.sql.functions.approx_percentile
pyspark/sql/functions.py::pyspark.sql.functions.array_insert
pyspark/sql/functions.py::pyspark.sql.functions.array_position
pyspark/sql/functions.py::pyspark.sql.functions.array_sort
pyspark/sql/functions.py::pyspark.sql.functions.array_union
pyspark/sql/functions.py::pyspark.sql.functions.arrays_zip
pyspark/sql/functions.py::pyspark.sql.functions.bit_count
pyspark/sql/functions.py::pyspark.sql.functions.bit_get
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_bit_position
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_bucket_number
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_construct_agg
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_count
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_or_agg
pyspark/sql/functions.py::pyspark.sql.functions.bitwise_not
pyspark/sql/functions.py::pyspark.sql.functions.broadcast
pyspark/sql/functions.py::pyspark.sql.functions.bround
pyspark/sql/functions.py::pyspark.sql.functions.collect_set
pyspark/sql/functions.py::pyspark.sql.functions.concat
pyspark/sql/functions.py::pyspark.sql.functions.conv
pyspark/sql/functions.py::pyspark.sql.functions.convert_timezone
pyspark/sql/functions.py::pyspark.sql.functions.count_distinct
pyspark/sql/functions.py::pyspark.sql.functions.count_if
pyspark/sql/functions.py::pyspark.sql.functions.count_min_sketch
pyspark/sql/functions.py::pyspark.sql.functions.csc
pyspark/sql/functions.py::pyspark.sql.functions.date_part
pyspark/sql/functions.py::pyspark.sql.functions.datepart
pyspark/sql/functions.py::pyspark.sql.functions.elt
pyspark/sql/functions.py::pyspark.sql.functions.exists
pyspark/sql/functions.py::pyspark.sql.functions.extract
pyspark/sql/functions.py::pyspark.sql.functions.filter
pyspark/sql/functions.py::pyspark.sql.functions.first
pyspark/sql/functions.py::pyspark.sql.functions.flatten
pyspark/sql/functions.py::pyspark.sql.functions.forall
pyspark/sql/functions.py::pyspark.sql.functions.format_number
pyspark/sql/functions.py::pyspark.sql.functions.format_string
pyspark/sql/functions.py::pyspark.sql.functions.from_csv
pyspark/sql/functions.py::pyspark.sql.functions.from_json
pyspark/sql/functions.py::pyspark.sql.functions.from_utc_timestamp
pyspark/sql/functions.py::pyspark.sql.functions.getbit
pyspark/sql/functions.py::pyspark.sql.functions.grouping_id
pyspark/sql/functions.py::pyspark.sql.functions.histogram_numeric
pyspark/sql/functions.py::pyspark.sql.functions.hll_sketch_agg
pyspark/sql/functions.py::pyspark.sql.functions.hll_sketch_estimate
pyspark/sql/functions.py::pyspark.sql.functions.hll_union
pyspark/sql/functions.py::pyspark.sql.functions.hll_union_agg
pyspark/sql/functions.py::pyspark.sql.functions.ilike
pyspark/sql/functions.py::pyspark.sql.functions.inline
pyspark/sql/functions.py::pyspark.sql.functions.inline_outer
pyspark/sql/functions.py::pyspark.sql.functions.input_file_block_length
pyspark/sql/functions.py::pyspark.sql.functions.input_file_block_start
pyspark/sql/functions.py::pyspark.sql.functions.input_file_name
pyspark/sql/functions.py::pyspark.sql.functions.java_method
pyspark/sql/functions.py::pyspark.sql.functions.json_object_keys
pyspark/sql/functions.py::pyspark.sql.functions.json_tuple
pyspark/sql/functions.py::pyspark.sql.functions.kurtosis
pyspark/sql/functions.py::pyspark.sql.functions.last
pyspark/sql/functions.py::pyspark.sql.functions.like
pyspark/sql/functions.py::pyspark.sql.functions.locate
pyspark/sql/functions.py::pyspark.sql.functions.make_dt_interval
pyspark/sql/functions.py::pyspark.sql.functions.make_interval
pyspark/sql/functions.py::pyspark.sql.functions.make_timestamp
pyspark/sql/functions.py::pyspark.sql.functions.make_timestamp_ltz
pyspark/sql/functions.py::pyspark.sql.functions.map_concat
pyspark/sql/functions.py::pyspark.sql.functions.map_entries
pyspark/sql/functions.py::pyspark.sql.functions.map_filter
pyspark/sql/functions.py::pyspark.sql.functions.map_from_entries
pyspark/sql/functions.py::pyspark.sql.functions.map_zip_with
pyspark/sql/functions.py::pyspark.sql.functions.median
pyspark/sql/functions.py::pyspark.sql.functions.mode
pyspark/sql/functions.py::pyspark.sql.functions.monotonically_increasing_id
pyspark/sql/functions.py::pyspark.sql.functions.months_between
pyspark/sql/functions.py::pyspark.sql.functions.parse_url
pyspark/sql/functions.py::pyspark.sql.functions.percentile
pyspark/sql/functions.py::pyspark.sql.functions.percentile_approx
pyspark/sql/functions.py::pyspark.sql.functions.position
pyspark/sql/functions.py::pyspark.sql.functions.printf
pyspark/sql/functions.py::pyspark.sql.functions.product
pyspark/sql/functions.py::pyspark.sql.functions.rand
pyspark/sql/functions.py::pyspark.sql.functions.randn
pyspark/sql/functions.py::pyspark.sql.functions.reduce
pyspark/sql/functions.py::pyspark.sql.functions.reflect
pyspark/sql/functions.py::pyspark.sql.functions.regexp_extract
pyspark/sql/functions.py::pyspark.sql.functions.regexp_extract_all
pyspark/sql/functions.py::pyspark.sql.functions.regexp_instr
pyspark/sql/functions.py::pyspark.sql.functions.regexp_substr
pyspark/sql/functions.py::pyspark.sql.functions.regr_avgy
pyspark/sql/functions.py::pyspark.sql.functions.regr_intercept
pyspark/sql/functions.py::pyspark.sql.functions.regr_r2
pyspark/sql/functions.py::pyspark.sql.functions.regr_slope
pyspark/sql/functions.py::pyspark.sql.functions.regr_sxy
pyspark/sql/functions.py::pyspark.sql.functions.regr_syy
pyspark/sql/functions.py::pyspark.sql.functions.schema_of_csv
pyspark/sql/functions.py::pyspark.sql.functions.schema_of_json
pyspark/sql/functions.py::pyspark.sql.functions.sec
pyspark/sql/functions.py::pyspark.sql.functions.sentences
pyspark/sql/functions.py::pyspark.sql.functions.session_window
pyspark/sql/functions.py::pyspark.sql.functions.shiftrightunsigned
pyspark/sql/functions.py::pyspark.sql.functions.skewness
pyspark/sql/functions.py::pyspark.sql.functions.soundex
pyspark/sql/functions.py::pyspark.sql.functions.spark_partition_id
pyspark/sql/functions.py::pyspark.sql.functions.split
pyspark/sql/functions.py::pyspark.sql.functions.stack
pyspark/sql/functions.py::pyspark.sql.functions.str_to_map
pyspark/sql/functions.py::pyspark.sql.functions.to_char
pyspark/sql/functions.py::pyspark.sql.functions.to_csv
pyspark/sql/functions.py::pyspark.sql.functions.to_json
pyspark/sql/functions.py::pyspark.sql.functions.to_number
pyspark/sql/functions.py::pyspark.sql.functions.to_unix_timestamp
pyspark/sql/functions.py::pyspark.sql.functions.to_utc_timestamp
pyspark/sql/functions.py::pyspark.sql.functions.to_varchar
pyspark/sql/functions.py::pyspark.sql.functions.transform
pyspark/sql/functions.py::pyspark.sql.functions.transform_keys
pyspark/sql/functions.py::pyspark.sql.functions.transform_values
pyspark/sql/functions.py::pyspark.sql.functions.try_add
pyspark/sql/functions.py::pyspark.sql.functions.try_avg
pyspark/sql/functions.py::pyspark.sql.functions.try_divide
pyspark/sql/functions.py::pyspark.sql.functions.try_multiply
pyspark/sql/functions.py::pyspark.sql.functions.try_subtract
pyspark/sql/functions.py::pyspark.sql.functions.try_sum
pyspark/sql/functions.py::pyspark.sql.functions.try_to_number
pyspark/sql/functions.py::pyspark.sql.functions.url_decode
pyspark/sql/functions.py::pyspark.sql.functions.url_encode
pyspark/sql/functions.py::pyspark.sql.functions.width_bucket
pyspark/sql/functions.py::pyspark.sql.functions.window
pyspark/sql/functions.py::pyspark.sql.functions.window_time
pyspark/sql/functions.py::pyspark.sql.functions.xpath
pyspark/sql/functions.py::pyspark.sql.functions.xpath_boolean
pyspark/sql/functions.py::pyspark.sql.functions.xpath_double
pyspark/sql/functions.py::pyspark.sql.functions.xpath_float
pyspark/sql/functions.py::pyspark.sql.functions.xpath_int
pyspark/sql/functions.py::pyspark.sql.functions.xpath_long
pyspark/sql/functions.py::pyspark.sql.functions.xpath_number
pyspark/sql/functions.py::pyspark.sql.functions.xpath_short
pyspark/sql/functions.py::pyspark.sql.functions.xpath_string
pyspark/sql/functions.py::pyspark.sql.functions.zip_with
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_archive
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_file
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_pyfile
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_zipped_package
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_basic_requests
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_cache_artifact
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_copy_from_local_to_fs
pyspark/sql/tests/connect/client/test_artifact.py::LocalClusterArtifactTests::test_add_archive
pyspark/sql/tests/connect/client/test_artifact.py::LocalClusterArtifactTests::test_add_file
pyspark/sql/tests/connect/client/test_artifact.py::LocalClusterArtifactTests::test_add_pyfile
pyspark/sql/tests/connect/client/test_artifact.py::LocalClusterArtifactTests::test_add_zipped_package
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_basic_open_process_close
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_invalid_writers
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_open_returning_false
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_process_throwing_error
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_simple_function
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_without_close_method
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_without_open_and_close_methods
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_without_open_method
pyspark/sql/tests/connect/streaming/test_parity_foreach_batch.py::StreamingForeachBatchParityTests::test_streaming_foreach_batch
pyspark/sql/tests/connect/streaming/test_parity_foreach_batch.py::StreamingForeachBatchParityTests::test_streaming_foreach_batch_tempview
pyspark/sql/tests/connect/streaming/test_parity_listener.py::StreamingListenerParityTests::test_listener_events
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_query_manager_await_termination
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_query_manager_get
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_await_termination
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_exception
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_read_options
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_read_options_overwrite
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_save_options
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_save_options_overwrite
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_status_and_progress
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_streaming_query_functions_basic
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_streaming_read_from_table
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_streaming_write_to_table
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_collect
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_collect_nested_type
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_collect_timestamp
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_column_regexp
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_create_global_temp_view
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_deduplicate_within_watermark_in_batch
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_describe
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_explain_string
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_extended_hint_types
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_grouped_data
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_hint
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_input_files
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_invalid_column
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_is_local
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_join_hint
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_json
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_multi_paths
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_numeric_aggregation
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_observe
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_orc
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_random_split
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_repartition_by_expression
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_repartition_by_range
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_replace
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_same_semantics
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_schema
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_semantic_hash
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_simple_datasource_read
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_simple_read_without_schema
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_simple_udt_from_read
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_sql_with_command
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_sql_with_pos_args
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_stat_approx_quantile
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_stat_freq_items
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_stat_sample_by
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_streaming_local_relation
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_tail
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_text
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_to
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_unpivot
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_with_local_list
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_with_local_ndarray
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_write_operations
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectSessionTests::test_error_stack_trace
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_cast
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_column_accessor
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_column_arithmetic_ops
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_column_field_ops
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_columns
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_decimal
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_distributed_sequence_id
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_aggregation_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_broadcast
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_call_udf
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_collection_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_csv_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_date_ts_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_generator_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_json_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_lambda_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_map_collection_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_math_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_misc_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_nested_lambda_function
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_normal_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_string_functions_multi_args
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_string_functions_one_arg
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_time_window_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_udf
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_udtf
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_when_otherwise
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_window_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_window_order
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_createDataFrame_duplicate_field_names
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_negative_and_zero_batch_size
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_pandas_self_destruct
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_toPandas_duplicate_field_names
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_nondeterministic_udf_in_aggregate
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_in_join_condition
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_not_supported_in_join_condition
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_with_input_file_name
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::UDFParityTests::test_nondeterministic_udf_in_aggregate
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_in_join_condition
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_not_supported_in_join_condition
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_with_input_file_name
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_function_exists
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_get_function
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_list_functions
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_list_tables
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_refresh_table
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_table_cache
pyspark/sql/tests/connect/test_parity_column.py::ColumnParityTests::test_bitwise_operations
pyspark/sql/tests/connect/test_parity_column.py::ColumnParityTests::test_drop_fields
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_cache_dataframe
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_cache_table
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_duplicate_field_names
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_extended_hint_types
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_freqItems
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_generic_hints
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_input_files
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_join_without_on
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_require_cross
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_sample
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_sample_with_random_seed
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_to
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_unpivot
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_unpivot_negative
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_checking_csv_header
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_encoding_json
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_ignore_column_of_all_nulls
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_ignorewhitespace_csv
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_jdbc
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_jdbc_format
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_linesep_json
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_linesep_text
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_multiline_csv
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_multiline_json
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_read_multiple_orc_file
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_read_text_file_list
pyspark/sql/tests/connect/test_parity_errors.py::ErrorsParityTests::test_array_index_out_of_bounds_exception
pyspark/sql/tests/connect/test_parity_errors.py::ErrorsParityTests::test_date_time_exception
pyspark/sql/tests/connect/test_parity_errors.py::ErrorsParityTests::test_number_format_exception
pyspark/sql/tests/connect/test_parity_errors.py::ErrorsParityTests::test_spark_runtime_exception
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_approxQuantile
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_assert_true
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_collect_functions
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_functions_broadcast
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_inline
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_input_file_name_udf
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_map_functions
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_nested_higher_order_function
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_np_scalar_input
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_nth_value
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_raise_error
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_reciprocal_trig_functions
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_sampleby
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_shiftrightunsigned
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_window_time
pyspark/sql/tests/connect/test_parity_pandas_grouped_map.py::GroupedApplyInPandasTests::test_grouped_over_window
pyspark/sql/tests/connect/test_parity_pandas_grouped_map.py::GroupedApplyInPandasTests::test_grouped_over_window_with_key
pyspark/sql/tests/connect/test_parity_pandas_grouped_map_with_state.py::GroupedApplyInPandasWithStateTests::test_apply_in_pandas_with_state_python_worker_random_failure
pyspark/sql/tests/connect/test_parity_pandas_map.py::MapInPandasParityTests::test_large_variable_types
pyspark/sql/tests/connect/test_parity_pandas_udf_grouped_agg.py::PandasUDFGroupedAggParityTests::test_invalid_args
pyspark/sql/tests/connect/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_nondeterministic_vectorized_udf_in_aggregate
pyspark/sql/tests/connect/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_scalar_iter_udf_init
pyspark/sql/tests/connect/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_vectorized_udf_check_config
pyspark/sql/tests/connect/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_vectorized_udf_invalid_length
pyspark/sql/tests/connect/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_bounded_mixed
pyspark/sql/tests/connect/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_bounded_simple
pyspark/sql/tests/connect/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_shrinking_window
pyspark/sql/tests/connect/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_sliding_window
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterParityTests::test_bucketed_write
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterParityTests::test_insert_into
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterParityTests::test_save_and_load
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterParityTests::test_save_and_load_builder
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterV2ParityTests::test_create_without_provider
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterV2ParityTests::test_table_overwrite
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_cast_to_string_with_udt
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_cast_to_udt_with_udt
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_complex_nested_udt_in_df
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_negative_decimal
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_parquet_with_udt
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_udf_with_udt
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_udt_with_none
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_yearmonth_interval_type
pyspark/sql/tests/connect/test_parity_udf.py::UDFParityTests::test_nondeterministic_udf_in_aggregate
pyspark/sql/tests/connect/test_parity_udf.py::UDFParityTests::test_udf_in_join_condition
pyspark/sql/tests/connect/test_parity_udf.py::UDFParityTests::test_udf_not_supported_in_join_condition
pyspark/sql/tests/connect/test_parity_udf.py::UDFParityTests::test_udf_with_input_file_name
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_arrow_sql_conf
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_terminate
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_with_table_argument_malformed_query
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_with_table_argument_multiple
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_with_table_argument_unknown_identifier
pyspark/sql/tests/connect/test_parity_udtf.py::UDTFParityTests::test_udtf_terminate
pyspark/sql/tests/connect/test_parity_udtf.py::UDTFParityTests::test_udtf_with_table_argument_malformed_query
pyspark/sql/tests/connect/test_parity_udtf.py::UDTFParityTests::test_udtf_with_table_argument_multiple
pyspark/sql/tests/connect/test_parity_udtf.py::UDTFParityTests::test_udtf_with_table_argument_unknown_identifier
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_approx_equal_decimaltype_custom_rtol_pass
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_data_frame_equal_not_support_streaming
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_equal_approx_pandas_on_spark_df
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_equal_exact_pandas_on_spark_df
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_equal_nested_struct_str_duplicate
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_error_pandas_pyspark_df
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_type_error_pandas_df

github-actions bot commented Jun 25, 2025

Gold Data Report

Notes
  1. The tables below show the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) in gold data input processing.
  2. A positive input is a valid test case, while a negative input is a test case that is expected to fail.

Commit Information

Commit  Revision  Branch
After   2d8d542   refs/pull/535/merge
Before  9ea0549   main

Summary

Commit    TP   TN  FP   FN  Total
After   1729  198  40  548   2515
Before  1727  198  40  550   2515
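
As a rough illustration of how the summary rows relate to the definitions in the notes, the sketch below checks that the four counts add up to the reported total and compares the two commits. It is not part of the report tooling: the `Counts` class and its `total` and `accuracy` properties are hypothetical names introduced here, and "accuracy" is assumed to mean the share of inputs handled as expected (TP + TN over the total).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical container for one row of the summary table."""

    tp: int  # valid inputs that were processed successfully
    tn: int  # expected-to-fail inputs that did fail
    fp: int  # expected-to-fail inputs that were accepted
    fn: int  # valid inputs that failed

    @property
    def total(self) -> int:
        return self.tp + self.tn + self.fp + self.fn

    @property
    def accuracy(self) -> float:
        # Assumed definition: fraction of inputs handled as expected.
        return (self.tp + self.tn) / self.total


# Values copied from the Summary table above.
after = Counts(tp=1729, tn=198, fp=40, fn=548)
before = Counts(tp=1727, tn=198, fp=40, fn=550)

# Inputs that moved from FN to TP between the two commits.
gained = after.tp - before.tp

print(f"total: {after.total}, gained TP: {gained}")
print(f"accuracy before: {before.accuracy:.4f}, after: {after.accuracy:.4f}")
```

In the per-file details, the only row whose counts differ between the two commits is function/bitwise.json, which accounts for the whole change in TP and FN.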

Details

Gold Data Metrics
Group File Commit TP TN FP FN Total
spark data_type.json After 43 5 0 0 48
Before 43 5 0 0 48
expression/case.json After 5 0 0 0 5
Before 5 0 0 0 5
expression/cast.json After 4 0 0 0 4
Before 4 0 0 0 4
expression/current.json After 2 0 0 0 2
Before 2 0 0 0 2
expression/date.json After 4 0 1 0 5
Before 4 0 1 0 5
expression/interval.json After 346 4 1 0 351
Before 346 4 1 0 351
expression/large.json After 2 0 0 0 2
Before 2 0 0 0 2
expression/like.json After 29 10 0 0 39
Before 29 10 0 0 39
expression/misc.json After 109 5 0 1 115
Before 109 5 0 1 115
expression/numeric.json After 31 6 1 0 38
Before 31 6 1 0 38
expression/string.json After 18 1 0 0 19
Before 18 1 0 0 19
expression/timestamp.json After 7 0 3 0 10
Before 7 0 3 0 10
expression/window.json After 73 0 1 0 74
Before 73 0 1 0 74
function/agg.json After 116 0 0 53 169
Before 116 0 0 53 169
function/array.json After 37 0 0 7 44
Before 37 0 0 7 44
function/bitwise.json After 7 0 0 8 15
Before 5 0 0 10 15
function/collection.json After 12 0 0 0 12
Before 12 0 0 0 12
function/conditional.json After 11 0 0 4 15
Before 11 0 0 4 15
function/conversion.json After 2 0 0 0 2
Before 2 0 0 0 2
function/csv.json After 0 0 0 5 5
Before 0 0 0 5 5
function/datetime.json After 102 0 0 43 145
Before 102 0 0 43 145
function/generator.json After 4 0 0 9 13
Before 4 0 0 9 13
function/hash.json After 5 0 0 2 7
Before 5 0 0 2 7
function/json.json After 4 0 0 16 20
Before 4 0 0 16 20
function/lambda.json After 0 0 0 31 31
Before 0 0 0 31 31
function/map.json After 6 0 0 5 11
Before 6 0 0 5 11
function/math.json After 84 0 0 40 124
Before 84 0 0 40 124
function/misc.json After 21 0 0 32 53
Before 21 0 0 32 53
function/predicate.json After 70 0 0 9 79
Before 70 0 0 9 79
function/string.json After 123 0 0 81 204
Before 123 0 0 81 204
function/struct.json After 2 0 0 0 2
Before 2 0 0 0 2
function/url.json After 0 0 0 10 10
Before 0 0 0 10 10
function/variant.json After 0 0 0 28 28
Before 0 0 0 28 28
function/window.json After 6 0 0 3 9
Before 6 0 0 3 9
function/xml.json After 0 0 0 17 17
Before 0 0 0 17 17
plan/ddl_alter_table.json After 49 14 3 11 77
Before 49 14 3 11 77
plan/ddl_alter_view.json After 5 1 0 0 6
Before 5 1 0 0 6
plan/ddl_analyze_table.json After 17 6 0 0 23
Before 17 6 0 0 23
plan/ddl_cache.json After 4 0 1 0 5
Before 4 0 1 0 5
plan/ddl_create_index.json After 0 0 0 3 3
Before 0 0 0 3 3
plan/ddl_create_table.json After 23 32 6 44 105
Before 23 32 6 44 105
plan/ddl_delete_from.json After 2 1 0 0 3
Before 2 1 0 0 3
plan/ddl_describe.json After 4 0 0 0 4
Before 4 0 0 0 4
plan/ddl_drop_index.json After 0 0 0 2 2
Before 0 0 0 2 2
plan/ddl_drop_view.json After 5 0 0 0 5
Before 5 0 0 0 5
plan/ddl_insert_into.json After 16 1 1 0 18
Before 16 1 1 0 18
plan/ddl_insert_overwrite.json After 9 0 2 0 11
Before 9 0 2 0 11
plan/ddl_load_data.json After 4 0 0 0 4
Before 4 0 0 0 4
plan/ddl_merge_into.json After 8 4 3 0 15
Before 8 4 3 0 15
plan/ddl_misc.json After 9 0 0 1 10
Before 9 0 0 1 10
plan/ddl_replace_table.json After 19 16 5 44 84
Before 19 16 5 44 84
plan/ddl_select.json After 1 0 0 0 1
Before 1 0 0 0 1
plan/ddl_show_views.json After 7 0 0 0 7
Before 7 0 0 0 7
plan/ddl_uncache.json After 2 0 0 0 2
Before 2 0 0 0 2
plan/ddl_update.json After 2 1 0 0 3
Before 2 1 0 0 3
plan/error_alter_table.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/error_analyze_table.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/error_create_table.json After 0 6 0 0 6
Before 0 6 0 0 6
plan/error_describe.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/error_join.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/error_load_data.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/error_misc.json After 0 14 0 0 14
Before 0 14 0 0 14
plan/error_order_by.json After 1 4 0 0 5
Before 1 4 0 0 5
plan/error_select.json After 0 15 0 0 15
Before 0 15 0 0 15
plan/error_with.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/plan_alter_view.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/plan_create_view.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/plan_explain.json After 0 1 1 0 2
Before 0 1 1 0 2
plan/plan_group_by.json After 9 1 0 1 11
Before 9 1 0 1 11
plan/plan_hint.json After 25 0 3 0 28
Before 25 0 3 0 28
plan/plan_insert_into.json After 3 0 0 0 3
Before 3 0 0 0 3
plan/plan_insert_overwrite.json After 2 0 0 0 2
Before 2 0 0 0 2
plan/plan_join.json After 53 2 1 6 62
Before 53 2 1 6 62
plan/plan_misc.json After 15 4 0 10 29
Before 15 4 0 10 29
plan/plan_order_by.json After 15 5 1 10 31
Before 15 5 1 10 31
plan/plan_select.json After 86 15 5 12 118
Before 86 15 5 12 118
plan/plan_set_operation.json After 17 0 0 0 17
Before 17 0 0 0 17
plan/plan_with.json After 6 0 1 0 7
Before 6 0 1 0 7
plan/unpivot_join.json After 4 0 0 0 4
Before 4 0 0 0 4
plan/unpivot_select.json After 14 6 0 0 20
Before 14 6 0 0 20
table_schema.json After 8 6 0 0 14
Before 8 6 0 0 14

github-actions bot commented Jun 25, 2025

Spark 3.5.5 Test Report

Commit Information

Commit  Revision  Branch
After   2d8d542   refs/pull/535/merge
Before  9ea0549   refs/heads/main

Test Summary

Suite Commit Failed Passed Skipped Warnings Time (s)
doctest-catalog After 13 12 6 4.87
Before 13 12 6 5.47
doctest-column After 33 2 5.46
Before 33 2 5.38
doctest-dataframe After 27 79 1 3 8.04
Before 27 79 1 3 8.44
doctest-functions After 136 266 7 7 13.93
Before 136 266 7 7 14.46
test-connect After 228 810 133 629 135.24
Before 228 810 133 629 138.68

Test Details

Error Counts
          404 Total
(+1)      211 Total Unique
-------- ---- ----------------------------------------------------------------------------------------------------------
           25 DocTestFailure
           14 UnsupportedOperationException: lambda function
           14 UnsupportedOperationException: streaming query manager command
           13 AssertionError: AnalysisException not raised
           11 PySparkAssertionError: [DIFFERENT_PANDAS_DATAFRAME] DataFrames are not almost equal:
           10 UnsupportedOperationException: unsupported data source format: "text"
           10 handle add artifacts
            8 AssertionError: False is not true
            8 UnsupportedOperationException: hint
            6 AnalysisException: Cannot cast to Decimal128(14, 7). Overflowing on NaN
            6 UnsupportedOperationException: function: window
            6 UnsupportedOperationException: write stream operation start
            5 UnsupportedOperationException: function: monotonically_increasing_id
            4 AssertionError: "TABLE_OR_VIEW_NOT_FOUND" does not match "No table named 'v'"
            4 AssertionError: Attributes of DataFrame.iloc[:, 7] (column name="8_timestamp_t") are different
            4 PySparkNotImplementedError: [NOT_IMPLEMENTED] rdd() is not implemented.
            4 UnsupportedOperationException: sample by
            4 UnsupportedOperationException: unknown aggregate function: hll_sketch_agg
            4 UnsupportedOperationException: unpivot
            3 AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="time") are different
            3 IllegalArgumentException: invalid argument: empty data source paths
(+3)        3 IllegalArgumentException: invalid argument: unsupported file format: TEXT
            3 UnsupportedOperationException: PlanNode::CacheTable
            3 UnsupportedOperationException: function: input_file_name
            3 UnsupportedOperationException: function: ~
            3 UnsupportedOperationException: handle analyze input files
            3 ValueError: Converting to Python dictionary is not supported when duplicate field names are present
            2 AnalysisException: Could not find config namespace "spark"
            2 AnalysisException: map requires all value types to be the same
            2 AnalysisException: two values expected: [Column(Column { relation: None, name: "#2" }), Column(Colum...
            2 AssertionError
            2 AssertionError: AnalysisException not raised by <lambda>
            2 AssertionError: Lists differ: [Row([22 chars](key=1, value='1'), Row(key=10, value='10'), R[2402 cha...
            2 IllegalArgumentException: expected value at line 1 column 1
            2 IllegalArgumentException: invalid argument: found FUNCTION at 5:13 expected 'DATABASE', 'SCHEMA', 'T...
            2 PythonException:  ZeroDivisionError: division by zero
            2 SparkRuntimeException: start_from index out of bounds
            2 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
(+2)        2 UnsupportedOperationException: Data Source option 'line_sep' is not supported yet.
(+2)        2 UnsupportedOperationException: Data Source option 'multi_line' is not supported yet.
            2 UnsupportedOperationException: approx quantile
            2 UnsupportedOperationException: collect metrics
            2 UnsupportedOperationException: freq items
            2 UnsupportedOperationException: function: bitmap_bit_position
            2 UnsupportedOperationException: function: format_number
            2 UnsupportedOperationException: function: from_csv
            2 UnsupportedOperationException: function: from_json
            2 UnsupportedOperationException: function: inline
            2 UnsupportedOperationException: function: map_entries
            2 UnsupportedOperationException: function: sec
            2 UnsupportedOperationException: function: shiftrightunsigned
            2 UnsupportedOperationException: handle analyze is local
            2 UnsupportedOperationException: handle analyze same semantics
            2 UnsupportedOperationException: list functions
            2 UnsupportedOperationException: pivot
            2 UnsupportedOperationException: position with 3 arguments is not supported yet
            2 UnsupportedOperationException: rebalance partitioning by expression
            2 UnsupportedOperationException: unknown aggregate function: collect_set
            2 UnsupportedOperationException: unresolved regex
            2 UnsupportedOperationException: unsupported data source format: "orc"
            2 UnsupportedOperationException: user defined data type should only exist in a field
            2 handle artifact statuses
            2 received metadata size exceeds hard limit (19714 vs. 16384);  :status:42B content-type:60B grpc-stat...
            1 AnalysisException: Cannot cast string 'abc' to value of Float64 type
            1 AnalysisException: Cannot cast value 'abc' to value of Boolean type
            1 AnalysisException: Cannot infer common argument type for comparison operation Boolean = Float64
            1 AnalysisException: Error parsing timestamp from '2023-01-01' using format '%d-%m-%Y': input contains...
            1 AnalysisException: Failed to coerce arguments to satisfy a call to 'nth_value' function: coercion fr...
            1 AnalysisException: Failed to parse placeholder id: cannot parse integer from empty string
            1 AnalysisException: Inconsistent data type across values list at row 1 column 1. Was Map(Field { name...
            1 AnalysisException: Table 'tbl1' already exists
            1 AnalysisException: UNION queries have different number of columns: left has 2 columns whereas right ...
            1 AnalysisException: view not found: tab2
            1 AssertionError: "2000000" does not match "raise_error expects a single UTF-8 string argument"
(+1)        1 AssertionError: "CSV header does not conform to the schema" does not match "Data Source option 'enfo...
(+1)        1 AssertionError: "Database 'memory:106305b4-388b-42dd-a4c6-75ba44da2b4e' dropped." does not match "in...
(+1)        1 AssertionError: "Database 'memory:50ef719b-d0d9-4652-971c-7514e394a7f1' dropped." does not match "in...
            1 AssertionError: "TABLE_OR_VIEW_NOT_FOUND" does not match "The table test_table already exists"
            1 AssertionError: "attribute.*missing" does not match "cannot resolve attribute: ObjectName([Identifie...
            1 AssertionError: "foobar" does not match "raise_error expects a single UTF-8 string argument"
            1 AssertionError: '+---[17 chars]-----+\n|                        x|\n+--------[132 chars]-+\n' != '+-...
            1 AssertionError: '4.0.0' != '3.5.5'
            1 AssertionError: ArrayIndexOutOfBoundsException not raised
            1 AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="a") are different
            1 AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="ts") are different
            1 AssertionError: Exception not raised
            1 AssertionError: Lists differ: [Row([14 chars] _c1=25, _c2='I am Hyukjin\n\nI love Spark!'),[86 chars...
            1 AssertionError: Lists differ: [Row(id=90, name='90'), Row(id=91, name='91'), Ro[176 chars]99')] != [...
            1 AssertionError: Lists differ: [Row(key='0'), Row(key='1'), Row(key='10'), Row(ke[1435 chars]99')] !=...
            1 AssertionError: Lists differ: [Row(ln(id)=0.0, ln(id)=0.0, struct(id, name)=Row(id=[1232 chars]0'))]...
            1 AssertionError: Row(point='[1.0, 2.0]', pypoint='[3.0, 4.0]') != Row(point='(1.0, 2.0)', pypoint='[3...
            1 AssertionError: StorageLevel(False, True, True, False, 1) != StorageLevel(False, False, False, False...
            1 AssertionError: Struc[30 chars]estampType(), True), StructField('val', IntegerType(), True)]) != Str...
            1 AssertionError: Struc[32 chars]e(), False), StructField('b', DoubleType(), Fa[158 chars]ue)]) != Str...
            1 AssertionError: Struc[40 chars]ue), StructField('val', ArrayType(DoubleType(), False), True)]) != St...
            1 AssertionError: Struc[64 chars]Type(), True), StructField('i', StringType(), True)]), False)]) != St...
            1 AssertionError: Struc[69 chars]e(), True), StructField('name', StringType(), True)]), True)]) != Str...
            1 AssertionError: YearMonthIntervalType(0, 1) != YearMonthIntervalType(0, 0)
            1 AssertionError: [1.0, 2.0] != ExamplePoint(1.0,2.0)
            1 AssertionError: dtype('<M8[us]') != 'datetime64[ns]'
            1 AttributeError: 'DataFrame' object has no attribute '_ipython_key_completions_'
            1 AttributeError: 'DataFrame' object has no attribute '_joinAsOf'
(+1)        1 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpvfo8zl2s'
            1 IllegalArgumentException: 83140 is too large to store in a Decimal128 of precision 4. Max is 9999
            1 IllegalArgumentException: column types must match schema types, expected Int64 but found List(Field ...
            1 IllegalArgumentException: column types must match schema types, expected LargeUtf8 but found Utf8 at...
            1 IllegalArgumentException: invalid argument: found FUNCTION at 7:15 expected 'DATABASE', 'SCHEMA', 'O...
            1 IllegalArgumentException: invalid argument: invalid digit found in string
(+1)        1 IllegalArgumentException: invalid argument: missing source
            1 ParseException: Error parsing timestamp from '1997/02/28 10:30:00': error parsing date
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] foreach() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] foreachPartition() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] localCheckpoint() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] sparkContext() is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] toJSON() is not implemented.
            1 PythonException:  AttributeError: 'NoneType' object has no attribute 'partitionId'
            1 PythonException:  AttributeError: 'list' object has no attribute 'x'
            1 PythonException:  AttributeError: 'list' object has no attribute 'y'
            1 SparkRuntimeException: Failed due to a difference in schemas: original schema: DFSchema { inner: Sch...
            1 UnknownTimeZoneError: 'PST'
            1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
            1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
            1 UnsupportedOperationException: COUNT DISTINCT with multiple arguments
(+1)        1 UnsupportedOperationException: Data Source option 'ignore_leading_white_space' is not supported yet.
(+1)        1 UnsupportedOperationException: Data Source option 'primitives_as_string' is not supported yet.
            1 UnsupportedOperationException: Insert into not implemented for this table
            1 UnsupportedOperationException: PlanNode::ClearCache
            1 UnsupportedOperationException: PlanNode::IsCached
            1 UnsupportedOperationException: SHOW FUNCTIONS
            1 UnsupportedOperationException: bucketing
            1 UnsupportedOperationException: deduplicate within watermark
            1 UnsupportedOperationException: function exists
            1 UnsupportedOperationException: function: array_insert
            1 UnsupportedOperationException: function: array_sort
            1 UnsupportedOperationException: function: arrays_zip
            1 UnsupportedOperationException: function: bit_count
            1 UnsupportedOperationException: function: bit_get
            1 UnsupportedOperationException: function: bitmap_bucket_number
            1 UnsupportedOperationException: function: bitmap_count
            1 UnsupportedOperationException: function: bround
            1 UnsupportedOperationException: function: conv
            1 UnsupportedOperationException: function: convert_timezone
            1 UnsupportedOperationException: function: csc
            1 UnsupportedOperationException: function: elt
            1 UnsupportedOperationException: function: format_string
            1 UnsupportedOperationException: function: getbit
            1 UnsupportedOperationException: function: inline_outer
            1 UnsupportedOperationException: function: java_method
            1 UnsupportedOperationException: function: json_object_keys
            1 UnsupportedOperationException: function: json_tuple
            1 UnsupportedOperationException: function: make_dt_interval
            1 UnsupportedOperationException: function: make_interval
            1 UnsupportedOperationException: function: make_timestamp_ltz
            1 UnsupportedOperationException: function: map_concat
            1 UnsupportedOperationException: function: map_from_entries
            1 UnsupportedOperationException: function: months_between
            1 UnsupportedOperationException: function: parse_url
            1 UnsupportedOperationException: function: printf
            1 UnsupportedOperationException: function: reflect
            1 UnsupportedOperationException: function: regexp_extract
            1 UnsupportedOperationException: function: regexp_extract_all
            1 UnsupportedOperationException: function: regexp_instr
            1 UnsupportedOperationException: function: regexp_substr
            1 UnsupportedOperationException: function: schema_of_csv
            1 UnsupportedOperationException: function: schema_of_json
            1 UnsupportedOperationException: function: sentences
            1 UnsupportedOperationException: function: session_window
            1 UnsupportedOperationException: function: soundex
            1 UnsupportedOperationException: function: spark_partition_id
            1 UnsupportedOperationException: function: split
            1 UnsupportedOperationException: function: stack
            1 UnsupportedOperationException: function: str_to_map
            1 UnsupportedOperationException: function: to_char
            1 UnsupportedOperationException: function: to_csv
            1 UnsupportedOperationException: function: to_json
            1 UnsupportedOperationException: function: to_number
            1 UnsupportedOperationException: function: to_unix_timestamp
            1 UnsupportedOperationException: function: to_utc_timestamp
            1 UnsupportedOperationException: function: to_varchar
            1 UnsupportedOperationException: function: try_add
            1 UnsupportedOperationException: function: try_divide
            1 UnsupportedOperationException: function: try_multiply
            1 UnsupportedOperationException: function: try_subtract
            1 UnsupportedOperationException: function: try_to_number
            1 UnsupportedOperationException: function: url_decode
            1 UnsupportedOperationException: function: url_encode
            1 UnsupportedOperationException: function: width_bucket
            1 UnsupportedOperationException: function: xpath
            1 UnsupportedOperationException: function: xpath_boolean
            1 UnsupportedOperationException: function: xpath_double
            1 UnsupportedOperationException: function: xpath_float
            1 UnsupportedOperationException: function: xpath_int
            1 UnsupportedOperationException: function: xpath_long
            1 UnsupportedOperationException: function: xpath_number
            1 UnsupportedOperationException: function: xpath_short
            1 UnsupportedOperationException: function: xpath_string
            1 UnsupportedOperationException: handle analyze semantic hash
            1 UnsupportedOperationException: make_timestamp with timezone is not yet implemented
            1 UnsupportedOperationException: unknown aggregate function: bitmap_or_agg
            1 UnsupportedOperationException: unknown aggregate function: count_if
            1 UnsupportedOperationException: unknown aggregate function: count_min_sketch
            1 UnsupportedOperationException: unknown aggregate function: grouping_id
            1 UnsupportedOperationException: unknown aggregate function: histogram_numeric
            1 UnsupportedOperationException: unknown aggregate function: percentile
            1 UnsupportedOperationException: unknown aggregate function: try_avg
            1 UnsupportedOperationException: unknown aggregate function: try_sum
            1 UnsupportedOperationException: unknown function: distributed_sequence_id
            1 UnsupportedOperationException: unknown function: product
            1 ValueError: Code in Status proto (StatusCode.INTERNAL) doesn't match status code (StatusCode.RESOURC...
            1 ValueError: The column label 'id' is not unique.
            1 ValueError: The column label 'struct' is not unique.
(-3)        0 AnalysisException: Unable to find factory for TEXT
(-1)        0 AssertionError: "Database 'memory:3a24bcf1-6131-41a2-9404-aa44752be2a5' dropped." does not match "in...
(-1)        0 AssertionError: "Database 'memory:4fd55fab-d897-4db3-9b59-2cc353cfbffa' dropped." does not match "in...
(-1)        0 AssertionError: Exception not raised by <lambda>
(-1)        0 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpyphj6mi2'
(-1)        0 ParseException: Error while parsing value 'id' as type 'Int32' for column 0 at line 0. Row data: '[i...
(-4)        0 UnsupportedOperationException: JSON data source read options are not supported yet
(-1)        0 UnsupportedOperationException: data writer option: ignoretrailingwhitespace
(-1)        0 UnsupportedOperationException: partitioning columns

Passed Tests Diff

(empty)

Failed Tests
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.cacheTable
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.clearCache
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.createTable
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.functionExists
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.getDatabase
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.getFunction
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.isCached
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.listDatabases
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.listFunctions
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.recoverPartitions
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.refreshByPath
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.refreshTable
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.uncacheTable
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame._ipython_key_completions_
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame._joinAsOf
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.checkpoint
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.coalesce
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.colRegex
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.dropDuplicatesWithinWatermark
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.explain
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.foreach
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.foreachPartition
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.hint
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.inputFiles
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.isLocal
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.isStreaming
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.localCheckpoint
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.observe
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.randomSplit
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.rdd
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.repartition
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.repartitionByRange
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.sameSemantics
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.sampleBy
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.storageLevel
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.toJSON
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.unpivot
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.withWatermark
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.writeStream
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrameStatFunctions.sampleBy
pyspark/sql/functions.py::pyspark.sql.functions.aggregate
pyspark/sql/functions.py::pyspark.sql.functions.approx_percentile
pyspark/sql/functions.py::pyspark.sql.functions.array_insert
pyspark/sql/functions.py::pyspark.sql.functions.array_position
pyspark/sql/functions.py::pyspark.sql.functions.array_sort
pyspark/sql/functions.py::pyspark.sql.functions.array_union
pyspark/sql/functions.py::pyspark.sql.functions.arrays_zip
pyspark/sql/functions.py::pyspark.sql.functions.bit_count
pyspark/sql/functions.py::pyspark.sql.functions.bit_get
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_bit_position
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_bucket_number
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_construct_agg
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_count
pyspark/sql/functions.py::pyspark.sql.functions.bitmap_or_agg
pyspark/sql/functions.py::pyspark.sql.functions.bitwise_not
pyspark/sql/functions.py::pyspark.sql.functions.broadcast
pyspark/sql/functions.py::pyspark.sql.functions.bround
pyspark/sql/functions.py::pyspark.sql.functions.collect_set
pyspark/sql/functions.py::pyspark.sql.functions.concat
pyspark/sql/functions.py::pyspark.sql.functions.conv
pyspark/sql/functions.py::pyspark.sql.functions.convert_timezone
pyspark/sql/functions.py::pyspark.sql.functions.count_distinct
pyspark/sql/functions.py::pyspark.sql.functions.count_if
pyspark/sql/functions.py::pyspark.sql.functions.count_min_sketch
pyspark/sql/functions.py::pyspark.sql.functions.csc
pyspark/sql/functions.py::pyspark.sql.functions.date_part
pyspark/sql/functions.py::pyspark.sql.functions.datepart
pyspark/sql/functions.py::pyspark.sql.functions.elt
pyspark/sql/functions.py::pyspark.sql.functions.exists
pyspark/sql/functions.py::pyspark.sql.functions.extract
pyspark/sql/functions.py::pyspark.sql.functions.filter
pyspark/sql/functions.py::pyspark.sql.functions.first
pyspark/sql/functions.py::pyspark.sql.functions.flatten
pyspark/sql/functions.py::pyspark.sql.functions.forall
pyspark/sql/functions.py::pyspark.sql.functions.format_number
pyspark/sql/functions.py::pyspark.sql.functions.format_string
pyspark/sql/functions.py::pyspark.sql.functions.from_csv
pyspark/sql/functions.py::pyspark.sql.functions.from_json
pyspark/sql/functions.py::pyspark.sql.functions.from_utc_timestamp
pyspark/sql/functions.py::pyspark.sql.functions.getbit
pyspark/sql/functions.py::pyspark.sql.functions.grouping_id
pyspark/sql/functions.py::pyspark.sql.functions.histogram_numeric
pyspark/sql/functions.py::pyspark.sql.functions.hll_sketch_agg
pyspark/sql/functions.py::pyspark.sql.functions.hll_sketch_estimate
pyspark/sql/functions.py::pyspark.sql.functions.hll_union
pyspark/sql/functions.py::pyspark.sql.functions.hll_union_agg
pyspark/sql/functions.py::pyspark.sql.functions.ilike
pyspark/sql/functions.py::pyspark.sql.functions.inline
pyspark/sql/functions.py::pyspark.sql.functions.inline_outer
pyspark/sql/functions.py::pyspark.sql.functions.input_file_block_length
pyspark/sql/functions.py::pyspark.sql.functions.input_file_block_start
pyspark/sql/functions.py::pyspark.sql.functions.input_file_name
pyspark/sql/functions.py::pyspark.sql.functions.java_method
pyspark/sql/functions.py::pyspark.sql.functions.json_object_keys
pyspark/sql/functions.py::pyspark.sql.functions.json_tuple
pyspark/sql/functions.py::pyspark.sql.functions.kurtosis
pyspark/sql/functions.py::pyspark.sql.functions.last
pyspark/sql/functions.py::pyspark.sql.functions.like
pyspark/sql/functions.py::pyspark.sql.functions.locate
pyspark/sql/functions.py::pyspark.sql.functions.make_dt_interval
pyspark/sql/functions.py::pyspark.sql.functions.make_interval
pyspark/sql/functions.py::pyspark.sql.functions.make_timestamp
pyspark/sql/functions.py::pyspark.sql.functions.make_timestamp_ltz
pyspark/sql/functions.py::pyspark.sql.functions.map_concat
pyspark/sql/functions.py::pyspark.sql.functions.map_entries
pyspark/sql/functions.py::pyspark.sql.functions.map_filter
pyspark/sql/functions.py::pyspark.sql.functions.map_from_entries
pyspark/sql/functions.py::pyspark.sql.functions.map_zip_with
pyspark/sql/functions.py::pyspark.sql.functions.median
pyspark/sql/functions.py::pyspark.sql.functions.mode
pyspark/sql/functions.py::pyspark.sql.functions.monotonically_increasing_id
pyspark/sql/functions.py::pyspark.sql.functions.months_between
pyspark/sql/functions.py::pyspark.sql.functions.parse_url
pyspark/sql/functions.py::pyspark.sql.functions.percentile
pyspark/sql/functions.py::pyspark.sql.functions.percentile_approx
pyspark/sql/functions.py::pyspark.sql.functions.position
pyspark/sql/functions.py::pyspark.sql.functions.printf
pyspark/sql/functions.py::pyspark.sql.functions.product
pyspark/sql/functions.py::pyspark.sql.functions.rand
pyspark/sql/functions.py::pyspark.sql.functions.randn
pyspark/sql/functions.py::pyspark.sql.functions.reduce
pyspark/sql/functions.py::pyspark.sql.functions.reflect
pyspark/sql/functions.py::pyspark.sql.functions.regexp_extract
pyspark/sql/functions.py::pyspark.sql.functions.regexp_extract_all
pyspark/sql/functions.py::pyspark.sql.functions.regexp_instr
pyspark/sql/functions.py::pyspark.sql.functions.regexp_substr
pyspark/sql/functions.py::pyspark.sql.functions.regr_avgy
pyspark/sql/functions.py::pyspark.sql.functions.regr_intercept
pyspark/sql/functions.py::pyspark.sql.functions.regr_r2
pyspark/sql/functions.py::pyspark.sql.functions.regr_slope
pyspark/sql/functions.py::pyspark.sql.functions.regr_sxy
pyspark/sql/functions.py::pyspark.sql.functions.regr_syy
pyspark/sql/functions.py::pyspark.sql.functions.schema_of_csv
pyspark/sql/functions.py::pyspark.sql.functions.schema_of_json
pyspark/sql/functions.py::pyspark.sql.functions.sec
pyspark/sql/functions.py::pyspark.sql.functions.sentences
pyspark/sql/functions.py::pyspark.sql.functions.session_window
pyspark/sql/functions.py::pyspark.sql.functions.shiftrightunsigned
pyspark/sql/functions.py::pyspark.sql.functions.skewness
pyspark/sql/functions.py::pyspark.sql.functions.soundex
pyspark/sql/functions.py::pyspark.sql.functions.spark_partition_id
pyspark/sql/functions.py::pyspark.sql.functions.split
pyspark/sql/functions.py::pyspark.sql.functions.stack
pyspark/sql/functions.py::pyspark.sql.functions.str_to_map
pyspark/sql/functions.py::pyspark.sql.functions.to_char
pyspark/sql/functions.py::pyspark.sql.functions.to_csv
pyspark/sql/functions.py::pyspark.sql.functions.to_json
pyspark/sql/functions.py::pyspark.sql.functions.to_number
pyspark/sql/functions.py::pyspark.sql.functions.to_unix_timestamp
pyspark/sql/functions.py::pyspark.sql.functions.to_utc_timestamp
pyspark/sql/functions.py::pyspark.sql.functions.to_varchar
pyspark/sql/functions.py::pyspark.sql.functions.transform
pyspark/sql/functions.py::pyspark.sql.functions.transform_keys
pyspark/sql/functions.py::pyspark.sql.functions.transform_values
pyspark/sql/functions.py::pyspark.sql.functions.try_add
pyspark/sql/functions.py::pyspark.sql.functions.try_avg
pyspark/sql/functions.py::pyspark.sql.functions.try_divide
pyspark/sql/functions.py::pyspark.sql.functions.try_multiply
pyspark/sql/functions.py::pyspark.sql.functions.try_subtract
pyspark/sql/functions.py::pyspark.sql.functions.try_sum
pyspark/sql/functions.py::pyspark.sql.functions.try_to_number
pyspark/sql/functions.py::pyspark.sql.functions.url_decode
pyspark/sql/functions.py::pyspark.sql.functions.url_encode
pyspark/sql/functions.py::pyspark.sql.functions.width_bucket
pyspark/sql/functions.py::pyspark.sql.functions.window
pyspark/sql/functions.py::pyspark.sql.functions.window_time
pyspark/sql/functions.py::pyspark.sql.functions.xpath
pyspark/sql/functions.py::pyspark.sql.functions.xpath_boolean
pyspark/sql/functions.py::pyspark.sql.functions.xpath_double
pyspark/sql/functions.py::pyspark.sql.functions.xpath_float
pyspark/sql/functions.py::pyspark.sql.functions.xpath_int
pyspark/sql/functions.py::pyspark.sql.functions.xpath_long
pyspark/sql/functions.py::pyspark.sql.functions.xpath_number
pyspark/sql/functions.py::pyspark.sql.functions.xpath_short
pyspark/sql/functions.py::pyspark.sql.functions.xpath_string
pyspark/sql/functions.py::pyspark.sql.functions.zip_with
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_archive
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_file
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_pyfile
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_zipped_package
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_basic_requests
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_cache_artifact
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_copy_from_local_to_fs
pyspark/sql/tests/connect/client/test_artifact.py::LocalClusterArtifactTests::test_add_archive
pyspark/sql/tests/connect/client/test_artifact.py::LocalClusterArtifactTests::test_add_file
pyspark/sql/tests/connect/client/test_artifact.py::LocalClusterArtifactTests::test_add_pyfile
pyspark/sql/tests/connect/client/test_artifact.py::LocalClusterArtifactTests::test_add_zipped_package
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_basic_open_process_close
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_invalid_writers
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_open_returning_false
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_process_throwing_error
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_with_simple_function
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_without_close_method
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_without_open_and_close_methods
pyspark/sql/tests/connect/streaming/test_parity_foreach.py::StreamingForeachParityTests::test_streaming_foreach_without_open_method
pyspark/sql/tests/connect/streaming/test_parity_foreach_batch.py::StreamingForeachBatchParityTests::test_streaming_foreach_batch
pyspark/sql/tests/connect/streaming/test_parity_foreach_batch.py::StreamingForeachBatchParityTests::test_streaming_foreach_batch_tempview
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_query_manager_await_termination
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_query_manager_get
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_await_termination
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_exception
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_read_options
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_read_options_overwrite
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_save_options
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_save_options_overwrite
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_stream_status_and_progress
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_streaming_query_functions_basic
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_streaming_read_from_table
pyspark/sql/tests/connect/streaming/test_parity_streaming.py::StreamingParityTests::test_streaming_write_to_table
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_collect
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_collect_nested_type
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_collect_timestamp
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_column_regexp
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_create_global_temp_view
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_deduplicate_within_watermark_in_batch
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_describe
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_explain_string
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_extended_hint_types
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_grouped_data
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_hint
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_input_files
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_invalid_column
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_is_local
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_join_hint
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_json
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_multi_paths
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_namedargs_with_global_limit
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_numeric_aggregation
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_observe
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_orc
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_random_split
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_repartition_by_expression
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_repartition_by_range
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_replace
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_same_semantics
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_schema
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_semantic_hash
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_simple_datasource_read
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_simple_read_without_schema
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_simple_udt_from_read
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_sql_with_command
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_sql_with_pos_args
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_stat_approx_quantile
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_stat_freq_items
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_stat_sample_by
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_streaming_local_relation
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_tail
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_text
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_to
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_unpivot
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_version
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_with_local_list
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_with_local_ndarray
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectBasicTests::test_write_operations
pyspark/sql/tests/connect/test_connect_basic.py::SparkConnectSessionTests::test_error_stack_trace
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_cast
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_column_accessor
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_column_arithmetic_ops
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_column_field_ops
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_columns
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_decimal
pyspark/sql/tests/connect/test_connect_column.py::SparkConnectColumnTests::test_distributed_sequence_id
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_aggregation_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_broadcast
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_call_udf
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_collection_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_csv_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_date_ts_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_generator_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_json_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_lambda_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_map_collection_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_math_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_misc_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_nested_lambda_function
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_normal_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_string_functions_multi_args
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_string_functions_one_arg
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_time_window_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_udf
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_udtf
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_when_otherwise
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_window_functions
pyspark/sql/tests/connect/test_connect_function.py::SparkConnectFunctionTests::test_window_order
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_createDataFrame_duplicate_field_names
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_createDataFrame_with_schema
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_negative_and_zero_batch_size
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_pandas_round_trip
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_pandas_self_destruct
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_timestamp_dst
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_timestamp_nat
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_toPandas_arrow_toggle
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_toPandas_duplicate_field_names
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_toPandas_nested_timestamp
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_toPandas_respect_session_timezone
pyspark/sql/tests/connect/test_parity_arrow.py::ArrowParityTests::test_toPandas_timestmap_tzinfo
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_nondeterministic_udf_in_aggregate
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_in_join_condition
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_not_supported_in_join_condition
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_with_input_file_name
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::UDFParityTests::test_nondeterministic_udf_in_aggregate
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_in_join_condition
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_not_supported_in_join_condition
pyspark/sql/tests/connect/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_with_input_file_name
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_function_exists
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_get_function
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_list_functions
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_list_tables
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_refresh_table
pyspark/sql/tests/connect/test_parity_catalog.py::CatalogParityTests::test_table_cache
pyspark/sql/tests/connect/test_parity_column.py::ColumnParityTests::test_bitwise_operations
pyspark/sql/tests/connect/test_parity_column.py::ColumnParityTests::test_drop_fields
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_cache_dataframe
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_cache_table
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_create_dataframe_from_pandas_with_dst
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_duplicate_field_names
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_extended_hint_types
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_freqItems
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_generic_hints
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_input_files
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_join_without_on
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_require_cross
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_to
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_to_pandas
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_unpivot
pyspark/sql/tests/connect/test_parity_dataframe.py::DataFrameParityTests::test_unpivot_negative
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_checking_csv_header
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_encoding_json
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_ignore_column_of_all_nulls
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_ignorewhitespace_csv
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_jdbc
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_jdbc_format
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_linesep_json
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_linesep_text
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_multiline_csv
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_multiline_json
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_read_multiple_orc_file
pyspark/sql/tests/connect/test_parity_datasources.py::DataSourcesParityTests::test_read_text_file_list
pyspark/sql/tests/connect/test_parity_errors.py::ErrorsParityTests::test_array_index_out_of_bounds_exception
pyspark/sql/tests/connect/test_parity_errors.py::ErrorsParityTests::test_date_time_exception
pyspark/sql/tests/connect/test_parity_errors.py::ErrorsParityTests::test_number_format_exception
pyspark/sql/tests/connect/test_parity_errors.py::ErrorsParityTests::test_spark_runtime_exception
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_approxQuantile
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_assert_true
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_collect_functions
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_functions_broadcast
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_inline
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_input_file_name_udf
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_map_functions
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_nested_higher_order_function
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_np_scalar_input
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_nth_value
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_raise_error
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_reciprocal_trig_functions
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_sampleby
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_shiftrightunsigned
pyspark/sql/tests/connect/test_parity_functions.py::FunctionsParityTests::test_window_time
pyspark/sql/tests/connect/test_parity_pandas_grouped_map.py::GroupedApplyInPandasTests::test_grouped_over_window
pyspark/sql/tests/connect/test_parity_pandas_grouped_map.py::GroupedApplyInPandasTests::test_grouped_over_window_with_key
pyspark/sql/tests/connect/test_parity_pandas_grouped_map_with_state.py::GroupedApplyInPandasWithStateTests::test_apply_in_pandas_with_state_python_worker_random_failure
pyspark/sql/tests/connect/test_parity_pandas_map.py::MapInPandasParityTests::test_large_variable_types
pyspark/sql/tests/connect/test_parity_pandas_udf_grouped_agg.py::PandasUDFGroupedAggParityTests::test_invalid_args
pyspark/sql/tests/connect/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_nondeterministic_vectorized_udf_in_aggregate
pyspark/sql/tests/connect/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_scalar_iter_udf_init
pyspark/sql/tests/connect/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_vectorized_udf_check_config
pyspark/sql/tests/connect/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_vectorized_udf_invalid_length
pyspark/sql/tests/connect/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_bounded_mixed
pyspark/sql/tests/connect/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_bounded_simple
pyspark/sql/tests/connect/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_shrinking_window
pyspark/sql/tests/connect/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_sliding_window
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterParityTests::test_bucketed_write
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterParityTests::test_insert_into
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterParityTests::test_save_and_load
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterParityTests::test_save_and_load_builder
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterV2ParityTests::test_create_without_provider
pyspark/sql/tests/connect/test_parity_readwriter.py::ReadwriterV2ParityTests::test_table_overwrite
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_cast_to_string_with_udt
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_cast_to_udt_with_udt
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_complex_nested_udt_in_df
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_negative_decimal
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_parquet_with_udt
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_udf_with_udt
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_udt_with_none
pyspark/sql/tests/connect/test_parity_types.py::TypesParityTests::test_yearmonth_interval_type
pyspark/sql/tests/connect/test_parity_udf.py::UDFParityTests::test_nondeterministic_udf_in_aggregate
pyspark/sql/tests/connect/test_parity_udf.py::UDFParityTests::test_udf_in_join_condition
pyspark/sql/tests/connect/test_parity_udf.py::UDFParityTests::test_udf_not_supported_in_join_condition
pyspark/sql/tests/connect/test_parity_udf.py::UDFParityTests::test_udf_with_input_file_name
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_arrow_sql_conf
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_terminate
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_with_table_argument_malformed_query
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_with_table_argument_multiple
pyspark/sql/tests/connect/test_parity_udtf.py::ArrowUDTFParityTests::test_udtf_with_table_argument_unknown_identifier
pyspark/sql/tests/connect/test_parity_udtf.py::UDTFParityTests::test_udtf_terminate
pyspark/sql/tests/connect/test_parity_udtf.py::UDTFParityTests::test_udtf_with_table_argument_malformed_query
pyspark/sql/tests/connect/test_parity_udtf.py::UDTFParityTests::test_udtf_with_table_argument_multiple
pyspark/sql/tests/connect/test_parity_udtf.py::UDTFParityTests::test_udtf_with_table_argument_unknown_identifier
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_approx_equal_decimaltype_custom_rtol_pass
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_data_frame_equal_not_support_streaming
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_equal_approx_pandas_on_spark_df
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_equal_exact_pandas_on_spark_df
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_equal_nested_struct_str_duplicate
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_error_pandas_pyspark_df
pyspark/sql/tests/connect/test_utils.py::ConnectUtilsTests::test_assert_type_error_pandas_df

github-actions bot commented Jun 25, 2025

Spark 4.0.0 Test Report

Commit Information

Commit Revision Branch
After 2d8d542 refs/pull/535/merge
Before 9ea0549 refs/heads/main

Test Summary

Suite Commit Failed Passed Skipped Warnings Time (s)
doctest-catalog After 15 10 11 5.96
Before 15 10 11 5.65
doctest-column After 2 33 10 6.45
Before 2 33 10 6.10
doctest-dataframe After 34 83 3 11 10.03
Before 34 83 3 11 9.74
doctest-functions After 216 226 10 16 21.09
Before 216 226 10 16 21.36
test-connect After 631 1110 181 369 273.38
Before 631 1110 181 369 276.07

Test Details

Error Counts
          898 Total
(+1)      321 Total Unique
-------- ---- ----------------------------------------------------------------------------------------------------------
           55 DocTestFailure
           41 IllegalArgumentException: missing argument: Python UDTF return type
           25 register data source command
           23 UnsupportedOperationException: unknown function: parse_json
           23 UnsupportedOperationException: with relations
           20 UnsupportedOperationException: streaming query manager command
           18 UnsupportedOperationException: handle add artifacts
           18 UnsupportedOperationException: named argument expression
           17 UnsupportedOperationException: write stream operation start
           14 IllegalArgumentException: invalid argument: invalid PySpark UDF type: 209
           14 IllegalArgumentException: invalid argument: invalid PySpark UDF type: 210
           14 UnsupportedOperationException: lambda function
           13 UnsupportedOperationException: variant data type
           12 AssertionError: AnalysisException not raised
           12 IllegalArgumentException: expected value at line 1 column 1
           12 UnsupportedOperationException: unresolved table valued function
           11 IllegalArgumentException: invalid argument: expected function for lateral table factor
           11 PySparkAssertionError: [DIFFERENT_PANDAS_DATAFRAME] DataFrames are not almost equal:
           10 AssertionError: 1 != 0 : dict_keys([])
           10 UnsupportedOperationException: unsupported data source format: "text"
            9 AssertionError: 3 != 0 : []
            9 AssertionError: False is not true
            9 UnsupportedOperationException: hint
            8 AssertionError
            8 UnsupportedOperationException: lateral join
            7 AnalysisException: Cannot cast to Decimal128(14, 7). Overflowing on NaN
            7 AnalysisException: map requires all value types to be the same
            6 PythonException:  IndexError: tuple index out of range
            6 UnsupportedOperationException: function: spark_partition_id
            6 UnsupportedOperationException: function: window
            6 UnsupportedOperationException: handle analyze is local
            5 AssertionError: `query_context_type` is required when QueryContext exists. QueryContext: [].
            5 PySparkTypeError: [UNSUPPORTED_DATA_TYPE_FOR_ARROW_CONVERSION] uint64 is not supported in conversion...
            5 UnsupportedOperationException: collect metrics
            5 UnsupportedOperationException: named function arguments
            5 UnsupportedOperationException: unpivot
            5 UnsupportedOperationException: user defined data type should only exist in a field
            5 checkpoint command
            4 AnalysisException: view not found: t2
            4 AssertionError: "TABLE_OR_VIEW_NOT_FOUND" does not match "No table named 'v'"
            4 AssertionError: AnalysisException not raised by <lambda>
            4 PythonException:  PySparkRuntimeError: [UDTF_EVAL_METHOD_ARGUMENTS_DO_NOT_MATCH_SIGNATURE] Failed to...
            4 UnsupportedOperationException: approx quantile
            4 UnsupportedOperationException: function: monotonically_increasing_id
            4 UnsupportedOperationException: sample by
            4 UnsupportedOperationException: unknown aggregate function: hll_sketch_agg
            4 UnsupportedOperationException: unknown function: listagg
            3 AssertionError: 1 != 0
            3 IllegalArgumentException: invalid argument: empty data source paths
            3 IllegalArgumentException: invalid argument: extraction must be a literal
(+3)        3 IllegalArgumentException: invalid argument: unsupported file format: TEXT
            3 UnboundLocalError: cannot access local variable 'q' where it is not associated with a value
            3 UnsupportedOperationException: PlanNode::CacheTable
            3 UnsupportedOperationException: function: from_json
            3 UnsupportedOperationException: function: input_file_name
            3 UnsupportedOperationException: function: make_interval
            3 UnsupportedOperationException: function: shuffle
            3 UnsupportedOperationException: function: ~
            3 UnsupportedOperationException: handle analyze input files
            3 UnsupportedOperationException: pivot
            3 UnsupportedOperationException: transpose
            3 UnsupportedOperationException: unknown table function: IDENTIFIER
            3 ValueError: Converting to Python dictionary is not supported when duplicate field names are present
            2 AnalysisException: Error parsing timestamp from '2015-04-08' using format 'yyyy-MM-dd': input contai...
            2 AnalysisException: Execution error: Function 'avg' user-defined coercion failed with "Error during p...
            2 AnalysisException: Invalid Python user-defined table function return type. Expect a struct type, but...
            2 AnalysisException: ambiguous attribute: ObjectName([Identifier("id")])
            2 AnalysisException: two values expected: [Column(Column { relation: None, name: "#2" }), Column(Colum...
            2 AnalysisException: view not found: array_struct
            2 AnalysisException: view not found: variant_table
            2 AssertionError: 3 != 0 : dict_keys([])
            2 AssertionError: unexpectedly None
            2 IllegalArgumentException: invalid argument: found FUNCTION at 5:13 expected 'DATABASE', 'SCHEMA', 'T...
            2 IllegalArgumentException: invalid argument: found PARTITION at 281:290 expected ',', or ')'
            2 IllegalArgumentException: invalid argument: found PARTITION at 295:304 expected ',', or ')'
            2 IllegalArgumentException: invalid argument: found PARTITION at 59:68 expected ',', or ')'
            2 IllegalArgumentException: invalid argument: found WITH at 171:175 expected ',', or ')'
            2 IllegalArgumentException: invalid argument: found WITH at 279:283 expected ',', or ')'
            2 IllegalArgumentException: invalid argument: invalid digit found in string
            2 IndexError: index out of bounds
            2 PySparkAssertionError: [DIFFERENT_ROWS] Results do not match: ( 99.50000 % )
            2 PythonException:  AssertionError: assert None is not None
            2 PythonException:  AttributeError: 'NoneType' object has no attribute 'cpus'
            2 PythonException:  KeyError: 'a'
            2 PythonException:  ZeroDivisionError: division by zero
            2 SparkRuntimeException: Spark `element_at`: expected a List or Map type, got Null
            2 SparkRuntimeException: start_from index out of bounds
            2 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
            2 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
(+2)        2 UnsupportedOperationException: Data Source option 'line_sep' is not supported yet.
(+2)        2 UnsupportedOperationException: Data Source option 'multi_line' is not supported yet.
            2 UnsupportedOperationException: LATERAL JOIN with criteria
            2 UnsupportedOperationException: Physical plan does not support logical expression Wildcard { qualifie...
            2 UnsupportedOperationException: clustering columns
            2 UnsupportedOperationException: freq items
            2 UnsupportedOperationException: function: bitmap_bit_position
            2 UnsupportedOperationException: function: convert_timezone
            2 UnsupportedOperationException: function: format_number
            2 UnsupportedOperationException: function: from_csv
            2 UnsupportedOperationException: function: inline
            2 UnsupportedOperationException: function: map_concat
            2 UnsupportedOperationException: function: map_entries
            2 UnsupportedOperationException: function: sec
            2 UnsupportedOperationException: function: shiftrightunsigned
            2 UnsupportedOperationException: handle analyze same semantics
            2 UnsupportedOperationException: list functions
            2 UnsupportedOperationException: position with 3 arguments is not supported yet
            2 UnsupportedOperationException: rebalance partitioning by expression
            2 UnsupportedOperationException: unknown aggregate function: collect_set
            2 UnsupportedOperationException: unknown function: dayname
            2 UnsupportedOperationException: unknown function: distributed_sequence_id
            2 UnsupportedOperationException: unknown function: from_xml
            2 UnsupportedOperationException: unknown function: is_valid_utf8
            2 UnsupportedOperationException: unknown function: monthname
            2 UnsupportedOperationException: unknown function: nullifzero
            2 UnsupportedOperationException: unknown function: randstr
            2 UnsupportedOperationException: unknown function: string_agg
            2 UnsupportedOperationException: unknown function: to_variant_object
            2 UnsupportedOperationException: unknown function: try_make_interval
            2 UnsupportedOperationException: unknown function: try_make_timestamp
            2 UnsupportedOperationException: unknown function: try_make_timestamp_ltz
            2 UnsupportedOperationException: unknown function: try_make_timestamp_ntz
            2 UnsupportedOperationException: unknown function: try_parse_json
            2 UnsupportedOperationException: unknown function: try_parse_url
            2 UnsupportedOperationException: unresolved regex
            2 UnsupportedOperationException: unsupported data source format: "orc"
            2 UnsupportedOperationException: wildcard with plan ID
            2 create resource profile command
            2 handle artifact statuses
            2 received metadata size exceeds hard limit (19714 vs. 16384);  :status:42B content-type:60B grpc-stat...
            1 AnalysisException: Cannot cast string 'Bob' to value of Int64 type
            1 AnalysisException: Cannot cast string 'abc' to value of Float64 type
            1 AnalysisException: Cannot cast value 'abc' to value of Boolean type
            1 AnalysisException: Cannot infer common argument type for comparison operation Boolean = Float64
            1 AnalysisException: Could not find config namespace "mapred"
            1 AnalysisException: Error parsing timestamp from '082017' using format '%m%Y': input is not enough fo...
            1 AnalysisException: Error parsing timestamp from '2014-31-12' using format '%Y-%d-%pa': input contain...
            1 AnalysisException: Error parsing timestamp from '2023-01-01' using format '%d-%m-%Y': input contains...
            1 AnalysisException: Failed to coerce arguments to satisfy a call to 'nth_value' function: coercion fr...
            1 AnalysisException: Failed to coerce arguments to satisfy a call to 'nth_value' function: coercion fr...
            1 AnalysisException: Failed to parse placeholder id: cannot parse integer from empty string
            1 AnalysisException: Inconsistent data type across values list at row 1 column 1. Was List(Field { nam...
            1 AnalysisException: Inconsistent data type across values list at row 1 column 1. Was Map(Field { name...
            1 AnalysisException: Second argument for `from_utc_timestamp` must be string, received None
            1 AnalysisException: Spark `weekofyear` function unsupported data type: Timestamp(Microsecond, Some("U...
            1 AnalysisException: Table 'tbl1' already exists
            1 AnalysisException: UNION queries have different number of columns: left has 2 columns whereas right ...
            1 AnalysisException: ambiguous attribute: ObjectName([Identifier("b")])
            1 AnalysisException: ambiguous attribute: ObjectName([Identifier("i")])
            1 AnalysisException: cannot resolve attribute: ObjectName([Identifier("x")])
(+1)        1 AnalysisException: gzip compression requires specifying a level such as gzip(4)
            1 AnalysisException: too big
            1 AnalysisException: unreachable
            1 AnalysisException: view not found: tab2
            1 AnalysisException: view not found: v2
(+1)        1 AssertionError: "CSV header does not conform to the schema" does not match "Data Source option 'enfo...
(+1)        1 AssertionError: "Database 'memory:3c8e443e-412b-4a9a-b6eb-2886355d0b40' dropped." does not match "in...
(+1)        1 AssertionError: "Database 'memory:e9b6f48b-54c1-44d9-9ee7-791714d19abe' dropped." does not match "in...
            1 AssertionError: "Invalid return type" does not match " AttributeError: 'Series' object has no attrib...
            1 AssertionError: "PARTITION_TRANSFORM_EXPRESSION_NOT_IN_PARTITIONED_BY" does not match "unknown funct...
            1 AssertionError: "UNRESOLVED_COLUMN.WITH_SUGGESTION" does not match "cannot resolve attribute: Object...
            1 AssertionError: "foobar" does not match "raise_error expects a single UTF-8 string argument"
            1 AssertionError: "requirement failed: Cogroup keys must have same size: 2 != 1" does not match "inval...
            1 AssertionError: '+---[17 chars]-----+\n|                        x|\n+--------[132 chars]-+\n' != '+-...
            1 AssertionError: '+---[23 chars]---+-----+\n|  1|    1|\n+---+-----+\nonly showing top 1 row' != '+--...
            1 AssertionError: 'deadbeef' is not None
            1 AssertionError: 0 not greater than 0
            1 AssertionError: 0.40248566366484795 != 0.9531453492357947 : Column<'rand(1)'>
            1 AssertionError: 6 != 0 : []
            1 AssertionError: ArrayIndexOutOfBoundsException not raised
            1 AssertionError: Exception not raised
            1 AssertionError: Lists differ: ["X'313233'", '123', '123', '123'] != ["CAST(X'313233' AS STRING)", 'C...
            1 AssertionError: Lists differ: ['a'] != ['b']
            1 AssertionError: Lists differ: [Row([14 chars] _c1=25, _c2='I am Hyukjin\n\nI love Spark!'),[86 chars...
            1 AssertionError: Lists differ: [Row(id=90, name='90'), Row(id=91, name='91'), Ro[176 chars]99')] != [...
            1 AssertionError: Lists differ: [Row(key='0'), Row(key='1'), Row(key='10'), Row(ke[1435 chars]99')] !=...
            1 AssertionError: Lists differ: [Row(ln(id)=0.0, ln(id)=0.0, struct(id, name)=Row(id=[1232 chars]0'))]...
            1 AssertionError: Row(point='[1.0, 2.0]', pypoint='[3.0, 4.0]') != Row(point='(1.0, 2.0)', pypoint='[3...
            1 AssertionError: SparkConnectGrpcException not raised
            1 AssertionError: StorageLevel(False, True, True, False, 1) != StorageLevel(False, False, False, False...
            1 AssertionError: Struc[30 chars]estampType(), True), StructField('val', IntegerType(), True)]) != Str...
            1 AssertionError: Struc[32 chars]e(), False), StructField('b', DoubleType(), Fa[158 chars]ue)]) != Str...
            1 AssertionError: Struc[40 chars]ue), StructField('val', ArrayType(DoubleType(), False), True)]) != St...
            1 AssertionError: True is not false : Default URL is not secure
            1 AssertionError: YearMonthIntervalType(0, 1) != YearMonthIntervalType(0, 0)
            1 AssertionError: [1.0, 2.0] != ExamplePoint(1.0,2.0)
            1 AttributeError: 'NoneType' object has no attribute 'extract_graph'
            1 AttributeError: 'NoneType' object has no attribute 'toText'
            1 FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/work/sail/sail/.venvs/test-spa...
(+1)        1 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp1tyqy0g6'
(+1)        1 FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpy1zr4l3n'
            1 IllegalArgumentException: column types must match schema types, expected Int64 but found List(Field ...
            1 IllegalArgumentException: column types must match schema types, expected LargeUtf8 but found Utf8 at...
            1 IllegalArgumentException: data did not match any variant of untagged enum JsonDataType
            1 IllegalArgumentException: invalid argument: empty data type
            1 IllegalArgumentException: invalid argument: field not found in input schema: col1
            1 IllegalArgumentException: invalid argument: found ( at 114:115 expected ':', data type, ',', or ')'
            1 IllegalArgumentException: invalid argument: found FUNCTION at 7:15 expected 'DATABASE', 'SCHEMA', 'O...
            1 IllegalArgumentException: invalid argument: found collate at 13:20 expected 'AS', identifier, '(', '...
            1 IllegalArgumentException: invalid argument: found something at 0:3 expected something else
            1 IllegalArgumentException: invalid argument: grouping sets with grouping expressions
            1 IllegalArgumentException: invalid argument: invalid user-defined window function type
(+1)        1 IllegalArgumentException: invalid argument: missing source
            1 ParseException: Invalid timezone "PST": failed to parse timezone
(+1)        1 PySparkAssertionError: Received incorrect server side session identifier for request. Please create ...
(+1)        1 PySparkAssertionError: Received incorrect server side session identifier for request. Please create ...
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] rdd is not implemented.
            1 PySparkNotImplementedError: [NOT_IMPLEMENTED] toJSON() is not implemented.
            1 PythonException:  AttributeError: 'NoneType' object has no attribute 'partitionId'
            1 PythonException:  AttributeError: 'list' object has no attribute 'x'
            1 PythonException:  AttributeError: 'list' object has no attribute 'y'
            1 PythonException:  TypeError: net.razorvine.pickle.PickleException: expected zero arguments for const...
            1 SparkRuntimeException: Failed due to a difference in schemas: original schema: DFSchema { inner: Sch...
            1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
            1 UnsupportedOperationException: Aggregate can not be used as a sliding accumulator because `retract_b...
(+1)        1 UnsupportedOperationException: Data Source option 'ignore_leading_white_space' is not supported yet.
(+1)        1 UnsupportedOperationException: Data Source option 'primitives_as_string' is not supported yet.
            1 UnsupportedOperationException: Insert into not implemented for this table
            1 UnsupportedOperationException: PlanNode::ClearCache
            1 UnsupportedOperationException: PlanNode::IsCached
            1 UnsupportedOperationException: SHOW FUNCTIONS
            1 UnsupportedOperationException: Support for 'approx_distinct' for data type Struct(name Utf8, value I...
            1 UnsupportedOperationException: as of join
            1 UnsupportedOperationException: bucketing
            1 UnsupportedOperationException: deduplicate within watermark
            1 UnsupportedOperationException: function exists
            1 UnsupportedOperationException: function: array_insert
            1 UnsupportedOperationException: function: array_sort
            1 UnsupportedOperationException: function: arrays_zip
            1 UnsupportedOperationException: function: bit_count
            1 UnsupportedOperationException: function: bit_get
            1 UnsupportedOperationException: function: bitmap_bucket_number
            1 UnsupportedOperationException: function: bitmap_count
            1 UnsupportedOperationException: function: bround
            1 UnsupportedOperationException: function: conv
            1 UnsupportedOperationException: function: csc
            1 UnsupportedOperationException: function: elt
            1 UnsupportedOperationException: function: format_string

(truncated)

Passed Tests Diff

(empty)

Failed Tests
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.cacheTable
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.clearCache
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.createTable
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.functionExists
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.getDatabase
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.getFunction
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.isCached
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.listColumns
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.listDatabases
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.listFunctions
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.listTables
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.recoverPartitions
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.refreshByPath
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.refreshTable
pyspark/sql/catalog.py::pyspark.sql.catalog.Catalog.uncacheTable
pyspark/sql/column.py::pyspark.sql.column.Column.try_cast
pyspark/sql/column.py::pyspark.sql.column.Column.when
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame._joinAsOf
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.approxQuantile
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.cache
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.coalesce
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.colRegex
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.dropDuplicatesWithinWatermark
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.dropna
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.exists
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.explain
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.groupingSets
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.hint
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.inputFiles
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.isLocal
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.isStreaming
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.lateralJoin
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.localCheckpoint
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.observe
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.persist
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.randomSplit
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.rdd
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.repartition
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.repartitionByRange
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.sameSemantics
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.sampleBy
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.scalar
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.storageLevel
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.toJSON
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.transpose
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.unpivot
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.withWatermark
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrame.writeStream
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrameNaFunctions.drop
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrameStatFunctions.approxQuantile
pyspark/sql/dataframe.py::pyspark.sql.dataframe.DataFrameStatFunctions.sampleBy
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.aes_decrypt
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.aes_encrypt
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.aggregate
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.approx_count_distinct
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.approx_percentile
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.array_agg
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.array_append
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.array_contains
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.array_insert
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.array_position
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.array_prepend
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.array_size
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.array_sort
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.arrays_overlap
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.arrays_zip
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.atan2
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.bit_count
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.bit_get
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.bitmap_bit_position
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.bitmap_bucket_number
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.bitmap_construct_agg
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.bitmap_count
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.bitmap_or_agg
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.bitwise_not
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.broadcast
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.bround
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.collation
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.collect_set
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.concat
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.conv
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.convert_timezone
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.corr
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.cosh
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.cot
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.countDistinct
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.count_distinct
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.count_if
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.count_min_sketch
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.create_map
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.csc
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.current_database
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.current_schema
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.date_part
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.datepart
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.dayname
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.degrees
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.elt
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.exists
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.exp
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.explode
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.explode_outer
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.extract
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.filter
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.first
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.flatten
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.forall
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.format_number
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.format_string
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.from_csv
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.from_json
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.from_utc_timestamp
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.from_xml
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.getbit
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.grouping_id
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.hash
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.histogram_numeric
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.hll_sketch_agg
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.hll_sketch_estimate
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.hll_union
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.hll_union_agg
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.ilike
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.inline
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.inline_outer
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.input_file_block_length
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.input_file_block_start
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.input_file_name
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.is_valid_utf8
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.is_variant_null
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.java_method
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.json_array_length
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.json_object_keys
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.json_tuple
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.kurtosis
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.last
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.like
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.listagg
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.listagg_distinct
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.locate
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.log
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.log10
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.log2
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.ltrim
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.make_dt_interval
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.make_interval
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.make_timestamp
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.make_timestamp_ltz
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.make_valid_utf8
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.map_concat
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.map_contains_key
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.map_entries
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.map_filter
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.map_from_entries
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.map_keys
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.map_values
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.map_zip_with
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.median
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.mode
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.monotonically_increasing_id
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.monthname
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.months_between
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.nth_value
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.nullifzero
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.parse_json
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.parse_url
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.percentile
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.percentile_approx
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.posexplode
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.posexplode_outer
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.position
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.positive
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.printf
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.product
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.rand
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.randn
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.randstr
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.reduce
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.reflect
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.regexp_extract
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.regexp_extract_all
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.regexp_instr
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.regexp_replace
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.regexp_substr
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.regr_avgx
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.regr_avgy
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.round
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.rtrim
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.schema_of_csv
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.schema_of_json
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.schema_of_variant
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.schema_of_variant_agg
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.schema_of_xml
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.sec
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.sentences
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.session_window
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.shiftrightunsigned
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.shuffle
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.sin
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.size
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.skewness
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.soundex
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.spark_partition_id
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.split
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.stack
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.str_to_map
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.string_agg
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.string_agg_distinct
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.tan
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.timestamp_add
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.timestamp_diff
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_char
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_csv
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_json
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_number
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_timestamp_ltz
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_timestamp_ntz
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_unix_timestamp
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_utc_timestamp
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_varchar
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_variant_object
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.to_xml
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.transform
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.transform_keys
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.transform_values
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.trim
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_add
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_aes_decrypt
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_avg
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_divide
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_make_interval
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_make_timestamp
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_make_timestamp_ltz
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_make_timestamp_ntz
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_mod
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_multiply
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_parse_json
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_parse_url
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_reflect
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_subtract
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_sum
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_to_number
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_url_decode
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_validate_utf8
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.try_variant_get
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.udf
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.udtf
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.uniform
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.url_decode
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.url_encode
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.validate_utf8
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.variant_get
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.weekofyear
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.width_bucket
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.window
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.window_time
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xpath
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xpath_boolean
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xpath_double
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xpath_float
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xpath_int
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xpath_long
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xpath_number
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xpath_short
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xpath_string
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.xxhash64
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.zeroifnull
pyspark/sql/functions/builtin.py::pyspark.sql.functions.builtin.zip_with
pyspark/sql/tests/connect/arrow/test_parity_arrow.py::ArrowParityTests::test_createDataFrame_pandas_duplicate_field_names
pyspark/sql/tests/connect/arrow/test_parity_arrow.py::ArrowParityTests::test_negative_and_zero_batch_size
pyspark/sql/tests/connect/arrow/test_parity_arrow.py::ArrowParityTests::test_pandas_self_destruct
pyspark/sql/tests/connect/arrow/test_parity_arrow.py::ArrowParityTests::test_toPandas_duplicate_field_names
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_apply_in_arrow
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_apply_in_arrow_column_order
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_apply_in_arrow_empty_groupby
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_apply_in_arrow_not_returning_arrow_table
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_apply_in_arrow_returning_empty_dataframe
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_apply_in_arrow_returning_empty_dataframe_and_wrong_column_names
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_apply_in_arrow_returning_wrong_column_names
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_apply_in_arrow_returning_wrong_types
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_apply_in_arrow_returning_wrong_types_positional_assignment
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_positional_assignment_conf
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_self_join
pyspark/sql/tests/connect/arrow/test_parity_arrow_cogrouped_map.py::CogroupedMapInArrowParityTests::test_with_local_data
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow_column_order
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow_empty_groupby
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow_not_returning_arrow_table
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow_returning_empty_dataframe
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow_returning_empty_dataframe_and_wrong_column_names
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow_returning_wrong_column_names
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow_returning_wrong_types
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow_returning_wrong_types_positional_assignment
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_apply_in_arrow_with_key
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_positional_assignment_conf
pyspark/sql/tests/connect/arrow/test_parity_arrow_grouped_map.py::GroupedApplyInArrowParityTests::test_self_join
pyspark/sql/tests/connect/arrow/test_parity_arrow_map.py::ArrowMapParityTests::test_map_in_arrow_with_barrier_mode
pyspark/sql/tests/connect/arrow/test_parity_arrow_map.py::ArrowMapParityTests::test_negative_and_zero_batch_size
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_chained_udfs_with_variant
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_kwargs
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_named_arguments
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_named_arguments_and_defaults
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_named_arguments_negative
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_nondeterministic_udf
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_nondeterministic_udf2
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_nondeterministic_udf_in_aggregate
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_nonparam_udf_with_aggregate
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_num_arguments
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_cache
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_globals_not_overwritten
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_in_join_condition
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_not_supported_in_join_condition
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_with_complex_variant_input
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_with_complex_variant_output
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_with_input_file_name
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_with_udt
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_with_variant_input
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::ArrowPythonUDFParityTests::test_udf_with_variant_output
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_chained_udfs_with_variant
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_kwargs
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_named_arguments
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_named_arguments_and_defaults
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_named_arguments_negative
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_nondeterministic_udf_in_aggregate
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_cache
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_in_join_condition
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_not_supported_in_join_condition
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_with_complex_variant_input
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_with_complex_variant_output
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_with_input_file_name
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_with_udt
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_with_variant_input
pyspark/sql/tests/connect/arrow/test_parity_arrow_python_udf.py::UDFParityTests::test_udf_with_variant_output
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_archive
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_file
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_pyfile
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_add_zipped_package
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_artifacts_cannot_be_overwritten
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_cache_artifact
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_copy_from_local_to_fs
pyspark/sql/tests/connect/client/test_artifact.py::ArtifactTests::test_single_chunked_and_chunked_artifact
pyspark/sql/tests/connect/client/test_artifact_localcluster.py::LocalClusterArtifactTests::test_add_archive
pyspark/sql/tests/connect/client/test_artifact_localcluster.py::LocalClusterArtifactTests::test_add_file
pyspark/sql/tests/connect/client/test_artifact_localcluster.py::LocalClusterArtifactTests::test_add_pyfile
pyspark/sql/tests/connect/client/test_artifact_localcluster.py::LocalClusterArtifactTests::test_add_zipped_package
pyspark/sql/tests/connect/client/test_artifact_localcluster.py::LocalClusterArtifactTests::test_artifacts_cannot_be_overwritten
pyspark/sql/tests/connect/client/test_client.py::SparkConnectClientTestCase::test_properties
pyspark/sql/tests/connect/pandas/test_parity_pandas_cogrouped_map.py::CogroupedApplyInPandasTests::test_case_insensitive_grouping_column
pyspark/sql/tests/connect/pandas/test_parity_pandas_cogrouped_map.py::CogroupedApplyInPandasTests::test_different_group_key_cardinality
pyspark/sql/tests/connect/pandas/test_parity_pandas_cogrouped_map.py::CogroupedApplyInPandasTests::test_self_join
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map.py::GroupedApplyInPandasTests::test_case_insensitive_grouping_column
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map.py::GroupedApplyInPandasTests::test_grouped_over_window
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map.py::GroupedApplyInPandasTests::test_grouped_over_window_with_key
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map_with_state.py::GroupedApplyInPandasWithStateTests::test_apply_in_pandas_with_state_basic
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map_with_state.py::GroupedApplyInPandasWithStateTests::test_apply_in_pandas_with_state_basic_fewer_data
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map_with_state.py::GroupedApplyInPandasWithStateTests::test_apply_in_pandas_with_state_basic_more_data
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map_with_state.py::GroupedApplyInPandasWithStateTests::test_apply_in_pandas_with_state_basic_no_state
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map_with_state.py::GroupedApplyInPandasWithStateTests::test_apply_in_pandas_with_state_basic_no_state_no_data
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map_with_state.py::GroupedApplyInPandasWithStateTests::test_apply_in_pandas_with_state_basic_with_null
pyspark/sql/tests/connect/pandas/test_parity_pandas_grouped_map_with_state.py::GroupedApplyInPandasWithStateTests::test_apply_in_pandas_with_state_python_worker_random_failure
pyspark/sql/tests/connect/pandas/test_parity_pandas_map.py::MapInPandasParityTests::test_large_variable_types
pyspark/sql/tests/connect/pandas/test_parity_pandas_map.py::MapInPandasParityTests::test_map_in_pandas_with_barrier_mode
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf.py::PandasUDFParityTests::test_pandas_udf_basic_with_return_type_string
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf.py::PandasUDFParityTests::test_pandas_udf_return_type_error
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf.py::PandasUDFParityTests::test_udf_wrong_arg
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_grouped_agg.py::PandasUDFGroupedAggParityTests::test_invalid_args
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_grouped_agg.py::PandasUDFGroupedAggParityTests::test_kwargs
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_grouped_agg.py::PandasUDFGroupedAggParityTests::test_named_arguments
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_grouped_agg.py::PandasUDFGroupedAggParityTests::test_named_arguments_and_defaults
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_grouped_agg.py::PandasUDFGroupedAggParityTests::test_named_arguments_negative
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_chained_udfs_with_complex_variant
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_chained_udfs_with_variant
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_kwargs
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_named_arguments
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_named_arguments_and_defaults
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_named_arguments_negative
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_nondeterministic_vectorized_udf_in_aggregate
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_scalar_iter_udf_init
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_udafs_with_complex_variant_input
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_udafs_with_complex_variant_output
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_udafs_with_variant_input
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_udafs_with_variant_output
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_udf_with_nested_variant_input
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_udf_with_variant_input
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_udf_with_variant_nested_output
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_udf_with_variant_output
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_vectorized_udf_check_config
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_scalar.py::PandasUDFScalarParityTests::test_vectorized_udf_invalid_length
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_bounded_mixed
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_bounded_simple
pyspark/sql/tests/connect/pandas/test_parity_pandas_udf_window.py::PandasUDFWindowParityTests::test_invalid_args
pyspark/s

(truncated)

@@ -163,6 +163,7 @@ mod retry_strategy {
#[serde(deny_unknown_fields)]
pub struct ExecutionConfig {
    pub batch_size: usize,
    pub collect_statistics: bool,

@shehabgamin shehabgamin Jun 26, 2025

DataFusion 48.0.0 introduces a significant performance regression. Rather than waiting for the next release, which changes the default value of this config to true, we can expose it and control it on our end.

Context:
apache/datafusion#16447
apache/datafusion#16486
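
To illustrate the idea of exposing the flag and controlling it on our side, here is a minimal sketch. It is not the code from this PR: the struct is a simplified, hypothetical stand-in for `ExecutionConfig` (no serde, only the two fields visible in the diff), the default values are assumptions chosen for illustration, and `to_datafusion_settings` is a made-up helper name. The option keys are written in DataFusion's `datafusion.execution.*` naming.

```rust
/// Hypothetical, simplified stand-in for the `ExecutionConfig` in the diff.
/// The real struct is deserialized with serde and may carry more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct ExecutionConfig {
    pub batch_size: usize,
    pub collect_statistics: bool,
}

impl Default for ExecutionConfig {
    fn default() -> Self {
        Self {
            // Assumed defaults, for illustration only.
            batch_size: 8192,
            // Set explicitly on our end rather than relying on the
            // engine's built-in default for this release.
            collect_statistics: true,
        }
    }
}

impl ExecutionConfig {
    /// Render the config as (key, value) pairs that could be forwarded
    /// to the query engine's session configuration.
    pub fn to_datafusion_settings(&self) -> Vec<(String, String)> {
        vec![
            (
                "datafusion.execution.batch_size".to_string(),
                self.batch_size.to_string(),
            ),
            (
                "datafusion.execution.collect_statistics".to_string(),
                self.collect_statistics.to_string(),
            ),
        ]
    }
}
```

With something along these lines, the application decides the value of `collect_statistics` itself, so a change in the upstream default would not silently alter behavior.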

@@ -206,12 +206,20 @@
description: The batch size for physical plan execution.
experimental: true

- key: execution.collect_statistics

@shehabgamin shehabgamin requested a review from linhr June 26, 2025 06:20
@shehabgamin shehabgamin marked this pull request as ready for review June 26, 2025 06:21
@shehabgamin shehabgamin requested a review from lonless9 June 26, 2025 06:21

@linhr linhr left a comment

Great work! 🚀

The DataSourceOptions trait looks really nice. And thanks for having all the tests!

@@ -225,13 +233,62 @@ def test_sql_temp_view(spark, df, df_view):
assert_frame_equal(spark.sql(f"SELECT * FROM {df_view}").toPandas(), df.toPandas()) # noqa: S608


def test_write(spark, df, tmpdir):
# CHECK HERE: DO NOT MERGE IF THIS COMMENT IS HERE!!!

😆

@lonless9

Well done!!!

I wonder if we can open an issue to track the options that are not yet supported, or perhaps this issue already exists and I overlooked it.

@shehabgamin shehabgamin merged commit d6ff160 into main Jun 26, 2025
13 checks passed
@shehabgamin shehabgamin deleted the file-configs-cont branch June 26, 2025 22:50
3 participants