In your daily work you've certainly met the problem of Table or view not found. The error happens in the situations like this:
val customerIds = JoinHelper.insertCustomers(1) JoinHelper.insertOrders(customerIds, 4) val jdbcOptions = Map("url" -> InMemoryDatabase.DbConnection, "user" -> InMemoryDatabase.DbUser, "password" -> InMemoryDatabase.DbPassword, "driver" -> InMemoryDatabase.DbDriver, "dbtable" -> "orders") val ordersDataFrame = sparkSession.read.format("jdbc") .options(jdbcOptions) .load() val ordersAmounts = ordersDataFrame.sqlContext.sql("SELECT amount FROM orders WHERE customers_id IS NOT NULL") val amounts = ordersAmounts.collect().map(row => (row.getAs[java.math.BigDecimal]("amount"))) amounts should have size 4 // amounts are set randomly, so not assert on them
Unfortunately the code doesn't work because of the error:
Table or view not found: orders; line 1 pos 19 org.apache.spark.sql.AnalysisException: Table or view not found: orders; line 1 pos 19
After all, why it's produced ? The table really exists in our in-memory database. In fact, Apache Spark SQL uses a concept of catalog to list all tables and views querable directly with SQL. The catalog's implementation is SessionCatalog class and in its lookupRelation(name: TableIdentifier): LogicalPlan method it tries to get a LogicalPlan representing the table or view. Since the engine doesn't register the tables automatically, it's mandatory to do it explicitly with createOrReplaceTempView(tableName: String) method:
val customerIds = JoinHelper.insertCustomers(1) JoinHelper.insertOrders(customerIds, 4) val jdbcOptions = Map("url" -> InMemoryDatabase.DbConnection, "user" -> InMemoryDatabase.DbUser, "password" -> InMemoryDatabase.DbPassword, "driver" -> InMemoryDatabase.DbDriver, "dbtable" -> "orders") val ordersDataFrame = sparkSession.read.format("jdbc") .options(jdbcOptions) .load() ordersDataFrame.createOrReplaceTempView("orders") val ordersAmounts = sparkSession.sqlContext.sql("SELECT amount FROM orders WHERE customers_id IS NOT NULL") val amounts = ordersAmounts.collect().map(row => (row.getAs[java.math.BigDecimal]("amount"))) amounts should have size 4 // amounts are set randomly, so not assert on them