Google BigQuery

Google BigQuery: overview

It is a data warehouse, not a relational database!

It is good at querying big data, not at writing single records!

Main features:

  • No primary key is needed

  • No indexes are required

  • Cost is based on the amount of data analyzed:

    • analyzed, NOT read

    • data volume (bytes per record × number of records analyzed), not the number of records

  • Bulk inserts are very fast

  • Not optimized for writing single records

Google BigQuery: create a dataset

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Create a new dataset in Google BigQuery; a dataset is a group of BigQuery tables that are logically related to each other.

Syntax

utils.createBigQueryDataset(datasetName);

Details
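
Example

A minimal call sketch; the dataset name below is a placeholder.

utils.createBigQueryDataset("MYBIGQUERY_DATASET");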

Google BigQuery: delete a dataset

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Delete an existing dataset from Google BigQuery.

Syntax

utils.deleteBigQueryDataset(datasetName);

Details
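
Example

A minimal call sketch; the dataset name below is a placeholder.

utils.deleteBigQueryDataset("MYBIGQUERY_DATASET");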

Google BigQuery: create a table

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Create a new table in the specified dataset of Google BigQuery. When creating a table in BigQuery, you need to specify a list of fields, each expressed with a name and a type.

Syntax

utils.createBigQueryTable(datasetName, tableName, columns);

Details
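
Example

A sketch assuming each column is expressed as a name/type pair; the exact format expected for the columns argument is given in the Details, so treat the shape below as illustrative only.

utils.createBigQueryTable(
  "MYBIGQUERY_DATASET", // dataset name (placeholder)
  "MYBIGQUERYTABLE",    // table name (placeholder)
  [
    { name: "CODE", type: "STRING" },        // assumed field-definition shape
    { name: "DESCRIPTION", type: "STRING" },
    { name: "COMPANY_ID", type: "INTEGER" }
  ] // columns
);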

Google BigQuery: delete a table

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Delete an existing table from the specified dataset.

Syntax

utils.deleteBigQueryTable(datasetName, tableName);

Details
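
Example

A minimal call sketch; dataset and table names are placeholders.

utils.deleteBigQueryTable("MYBIGQUERY_DATASET", "MYBIGQUERYTABLE");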

Google BigQuery: execute a SQL query and get results one row at a time as a list of values

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Execute the specified SQL query on the BigQuery service and get back the result, one record at a time, as a list of values.

Syntax

utils.executeBigQueryWithCallback(processRowFunName, defaultDataset, sqlQuery, params);

Details
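
Example

A sketch assuming processRow is the name of a server-side function that receives each row as a list of values; dataset, table and field names are placeholders.

utils.executeBigQueryWithCallback(
  "processRow",         // name of the callback function invoked once per row (hypothetical)
  "MYBIGQUERY_DATASET", // default dataset
  "SELECT CODE,DESCRIPTION FROM MYBIGQUERYTABLE WHERE COMPANY_ID=?",
  [userInfo.companyId]  // values for bind variables
);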

Google BigQuery: execute a SQL query and get results one row at a time as an object

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Execute the specified SQL query on the BigQuery service and get back the result, one record at a time, expressed as a JavaScript object.

Syntax

utils.executeBigQueryWithObjectCallback(processRowFunName, defaultDataset, sqlQuery, params);

Details
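
Example

A sketch assuming processRowObject is the name of a server-side function that receives each row as a JavaScript object; dataset, table and field names are placeholders.

utils.executeBigQueryWithObjectCallback(
  "processRowObject",   // name of the callback function invoked once per row (hypothetical)
  "MYBIGQUERY_DATASET", // default dataset
  "SELECT CODE,DESCRIPTION FROM MYBIGQUERYTABLE WHERE COMPANY_ID=?",
  [userInfo.companyId]  // values for bind variables
);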

Google BigQuery: execute a SQL query and save results in a BigQuery table

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Execute the specified SQL query on the BigQuery service and save the result in a BigQuery table, which is created automatically.

Syntax

utils.executeBigQuerySaveOnTable(destinationDataset, destinationTable, defaultDataset, sqlQuery, params);

Details
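
Example

A sketch with placeholder dataset and table names; the destination table is created automatically.

utils.executeBigQuerySaveOnTable(
  "MYDESTINATION_DATASET", // destination dataset
  "MYDESTINATION_TABLE",   // destination table, created automatically
  "MYBIGQUERY_DATASET",    // default dataset for the query
  "SELECT CODE,DESCRIPTION FROM MYBIGQUERYTABLE WHERE COMPANY_ID=?",
  [userInfo.companyId]     // values for bind variables
);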

Google BigQuery: execute a SQL query and save results in a relational db table

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Execute the specified SQL query on the BigQuery service and save the result in a local table stored in the relational database.

Syntax

utils.executeBigQuerySaveOnLocalTable(
  datastoreId, 
  localTableName,
  defaultFieldNames, 
  csvFields,
  defaultDataset,
  sqlQuery,
  params
);

Details

Example

utils.executeBigQuerySaveOnLocalTable(
  null, // datastoreId, null for default db schema
  "MYTABLE",
  { 
    USER_ID_CREATE_DATE: userInfo.username,
    ROW_VERSION: 1
  }, // default values
  [
    { fieldName: "CODE" },
    {}, // no mapping to a field
    { fieldName: "DESCRIPTION" }
  ], // csv fields
  "MYBIGQUERY_DATASET", // defaultDataset,
  "SELECT CODE,DESCRIPTION FROM MYBIGQUERYTABLE WHERE COMPANY_ID=? ", // SQL BigQuery
  [userInfo.companyId] // values for bind variables
);

Google BigQuery: execute a SQL query and save results in a text file in GCS

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Extract data from the specified BigQuery table to a Google Cloud Storage file. The field separator is always the comma (,).

  • You can export up to 1 GB of table data to a single file.

  • If you are exporting more than 1 GB of data, use a wildcard to export the data into multiple files.

  • When you export data to multiple files, the size of the files will vary.

  • You cannot export nested and repeated data in CSV format. Nested and repeated data is supported for Avro and JSON exports.

  • You cannot export data from multiple tables in a single export job.

Syntax

utils.extractBigQueryData(datasetName, tableName, format, gcsUrl, deleteTableAfterExport, printHeaders);

Details

Examples for gcsUrl:

gs://my-bucket/file-name.json

gs://my-bucket/file-name-*.json

gs://my-bucket/path-component-*/file-name.json
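
Example

A sketch with placeholder names; the last two arguments follow the parameter names in the Syntax above (deleteTableAfterExport, printHeaders), so check the Details for the exact accepted values.

utils.extractBigQueryData(
  "MYBIGQUERY_DATASET",
  "MYBIGQUERYTABLE",
  "CSV",                            // export format
  "gs://my-bucket/file-name-*.csv", // wildcard, for tables larger than 1 GB
  false,                            // deleteTableAfterExport
  true                              // printHeaders
);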

Google BigQuery: import data from a local text file to BigQuery table

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Import a text file stored in the server file system to a BigQuery table.

When you load CSV or JSON data, values in DATE columns must use the dash (-) separator and the date must be in the following format: YYYY-MM-DD (year-month-day).

When you load JSON or CSV data, values in TIMESTAMP columns must use a dash (-) separator for the date portion of the timestamp, and the date must be in the following format: YYYY-MM-DD (year-month-day).

The hh:mm:ss (hour-minute-second) portion of the timestamp must use a colon (:) separator.

Syntax

var numberOfImportedRows = utils.importBigQueryDataFromLocalDatasource(
  datasetName, tableName, location,
  format, csvPath, encoding, separator, maxBadErrors, truncate, deleteSrcFileAfterImport
);

Details
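
Example

A sketch with placeholder values; the arguments follow the order of the Syntax above.

var numberOfImportedRows = utils.importBigQueryDataFromLocalDatasource(
  "MYBIGQUERY_DATASET",
  "MYBIGQUERYTABLE",
  "EU",              // location (placeholder)
  "CSV",             // or "JSON"
  "/tmp/mydata.csv", // path of the file on the server file system (placeholder)
  "UTF-8",
  ",",               // CSV field separator
  0,                 // maxBadErrors
  true,              // truncate the table before loading
  false              // deleteSrcFileAfterImport
);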

Google BigQuery: import multiple local files to BigQuery tables

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Import multiple text files stored in the server file system to BigQuery tables.

When you load CSV or JSON data, values in DATE columns must use the dash (-) separator and the date must be in the following format: YYYY-MM-DD (year-month-day).

When you load JSON or CSV data, values in TIMESTAMP columns must use a dash (-) separator for the date portion of the timestamp, and the date must be in the following format: YYYY-MM-DD (year-month-day).

The hh:mm:ss (hour-minute-second) portion of the timestamp must use a colon (:) separator.

Syntax

var numberOfImportedRows = utils.importMultipleDataFromLocalDatasource(
  datasetName, tableNames, location,
  format, baseDir, csvFiles, encoding, separator, maxBadErrors, truncate, deleteSrcFileAfterImport
);

Details
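
Example

A sketch with placeholder values, assuming one CSV file per destination table.

var numberOfImportedRows = utils.importMultipleDataFromLocalDatasource(
  "MYBIGQUERY_DATASET",
  ["TABLE_A", "TABLE_B"],         // destination tables
  "EU",                           // location (placeholder)
  "CSV",                          // or "JSON"
  "/tmp/import",                  // base directory on the server (placeholder)
  ["table_a.csv", "table_b.csv"], // one file per table
  "UTF-8",
  ",",                            // CSV field separator
  0,                              // maxBadErrors
  true,                           // truncate the tables before loading
  false                           // deleteSrcFileAfterImport
);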

Google BigQuery: import data from a GCS text file to BigQuery table

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Import a text file stored in GCS to a BigQuery table.

When you load CSV or JSON data, values in DATE columns must use the dash (-) separator and the date must be in the following format: YYYY-MM-DD (year-month-day).

When you load JSON or CSV data, values in TIMESTAMP columns must use a dash (-) separator for the date portion of the timestamp, and the date must be in the following format: YYYY-MM-DD (year-month-day).

The hh:mm:ss (hour-minute-second) portion of the timestamp must use a colon (:) separator.

Syntax

var numberOfImportedRows = utils.importBigQueryDataFromGCS(
  datasetName, tableName, format, 
  sourceUri, encoding, separator
);

Details

Examples for sourceUri:

gs://my-bucket/file-name.json

gs://my-bucket/file-name-*.json
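
Example

A sketch with placeholder names.

var numberOfImportedRows = utils.importBigQueryDataFromGCS(
  "MYBIGQUERY_DATASET",
  "MYBIGQUERYTABLE",
  "CSV",                            // or "JSON"
  "gs://my-bucket/file-name-*.csv", // sourceUri
  "UTF-8",
  ","                               // CSV field separator
);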

Google BigQuery: import data from a GCS text file to BigQuery table using a specific fields schema

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Import a text file stored in GCS to a BigQuery table. Not all fields must be specified: you define the CSV schema in terms of field names, so the CSV can contain only the fields to fill in rather than all table fields.

When you load CSV or JSON data, values in DATE columns must use the dash (-) separator and the date must be in the following format: YYYY-MM-DD (year-month-day).

When you load JSON or CSV data, values in TIMESTAMP columns must use a dash (-) separator for the date portion of the timestamp, and the date must be in the following format: YYYY-MM-DD (year-month-day).

The hh:mm:ss (hour-minute-second) portion of the timestamp must use a colon (:) separator.

Syntax

var numberOfImportedRows = utils.importDataFromGCSWithSchema(
  datasetName, tableName, format, 
  sourceUri, encoding, separator,
  String[] fieldNames
);

Details

Examples for sourceUri:

gs://my-bucket/file-name.json

gs://my-bucket/file-name-*.json

Example:

var processedRows = utils.importDataFromGCSWithSchema(
  "myDatasetName",
  "MyTableName",
  "CSV", // or "JSON"
  "gs://bucketName/myfilename.csv",
  "UTF-8", // or another charset
  ",", // CSV field separator: , or ;
  ["ID","ROW_VERSION","TESTO"] // array of field names
);

Google BigQuery: import data as a streaming to BigQuery table

In order to work with BigQuery, you need to set up the Google settings and assign BigQuery permissions to the auth key.

Import a single record via streaming insert.

When you load CSV or JSON data, values in DATE columns must use the dash (-) separator and the date must be in the following format: YYYY-MM-DD (year-month-day).

When you load JSON or CSV data, values in TIMESTAMP columns must use a dash (-) separator for the date portion of the timestamp, and the date must be in the following format: YYYY-MM-DD (year-month-day).

The hh:mm:ss (hour-minute-second) portion of the timestamp must use a colon (:) separator.

Syntax

var errorsString = utils.importBigQueryThroughStreaming(datasetName, tableName, record);

Details

Returns errors during import, expressed as a JSON map; null if there are no errors.
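
Example

A sketch assuming the record is expressed as a JavaScript object mapping field names to values; dataset, table and field names are placeholders.

var errorsString = utils.importBigQueryThroughStreaming(
  "MYBIGQUERY_DATASET",
  "MYBIGQUERYTABLE",
  { CODE: "A01", DESCRIPTION: "First item" } // record: field name/value pairs (assumed shape)
);
// errorsString is null when the insert succeeded, otherwise it contains a JSON map of errors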

Google BigQuery: execute a SQL query

It is possible to execute a SQL query on a single BigQuery table and get the whole result set as a JSON string.

Syntax

var json = utils.executeQueryOnBigQuery(
  String sql,
  Long dataModelId,
  Boolean interruptExecution,
  Object... pars
);

Note: if your SQL query contains a SELECT clause like this one:

SELECT TABLE1.FIELD_A,TABLE2.FIELD_B FROM ...

as a result, the returned JSON will contain:

{
  table1: {
    fieldA: "..."
  },
  table2: {
    fieldB: "..."
  }
}

that is, an inner object is created for each referenced table.

If you don't want inner objects in return, you have two choices:

  • set an alias for each field referenced in the SELECT clause

  • use executeQueryOnBigQueryWithSettings and pass forward the directive { noInnerObjects: true }

Important note: do NOT use this method to retrieve a long result set (more than a hundred records).
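
Example

A sketch with placeholder SQL and data model id; field aliases are used to avoid inner objects in the returned JSON.

var json = utils.executeQueryOnBigQuery(
  "SELECT T.CODE AS CODE, T.DESCRIPTION AS DESCRIPTION FROM MYBIGQUERYTABLE T WHERE T.COMPANY_ID=?",
  123,  // dataModelId of the BigQuery table (placeholder)
  true, // interruptExecution
  [userInfo.companyId] // values for bind variables
);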

Google BigQuery: execute a SQL query with settings

It is possible to execute a SQL query on a single BigQuery table and get the whole result set as a JSON string.

Syntax

var json = utils.executeQueryOnBigQueryWithSettings(
  String sql,
  Long dataModelId,
  Boolean interruptExecution,
  Map settings,
  Object... pars
);

Important note: do NOT use this method to retrieve a long result set (more than a hundred records).
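
Example

A sketch with placeholder values, using the noInnerObjects directive to get plain records back.

var json = utils.executeQueryOnBigQueryWithSettings(
  "SELECT CODE,DESCRIPTION FROM MYBIGQUERYTABLE WHERE COMPANY_ID=?",
  123,  // dataModelId of the BigQuery table (placeholder)
  true, // interruptExecution
  { noInnerObjects: true }, // settings
  [userInfo.companyId] // values for bind variables
);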

Google BigQuery: execute a SQL query and retrieve a block of data

It is possible to execute a SQL query on a single BigQuery table and get a partial result set, a block of data used to fill a grid. This method is helpful within a server-side JavaScript business component bound to a grid.

Syntax

var json = utils.getPartialResultOnBigQueryWithSettings(
  String sql,
  Long dataModelId,
  Boolean interruptExecution,
  Map settings,
  Object... pars
);

Details

This method also parses the HTTP request parameters passed forward by the calling grid (current block of data and block size) as well as filtering and sorting conditions coming from the grid.

Note: if your SQL query contains a SELECT clause like this one:

SELECT TABLE1.FIELD_A,TABLE2.FIELD_B FROM ...

as a result, the returned JSON will contain:

{
  table1: {
    fieldA: "..."
  },
  table2: {
    fieldB: "..."
  }
}

that is, an inner object is created for each referenced table.

If you don't want inner objects in return, you have two choices:

  • set an alias for each field referenced in the SELECT clause

  • pass forward the directive { noInnerObjects: true }

Note: the settings argument is optional; if set, it is a JavaScript object containing a series of directives:

  • noInnerObjects: true - returns a list of records, each without inner objects, i.e. a plain record

  • select, from, where, groupBy, having, orderBy: optional; in case of very complex queries (e.g. inner selects, unions, joins with subqueries) it is recommended to leave the "sql" argument set to null and use these properties to explicitly define the parts of the SQL query; do not include the keywords SELECT, FROM, etc.; select and from are mandatory when the "sql" argument is left empty.

Example:

var json = utils.getPartialResultOnBigQueryWithSettings(
  null, // sql
  xyz, // dataModelId,
  true, // interruptExecution,
  { // settings
    select: "FIELD1,FIELD2",
    from: "TABLE1 INNER JOIN TABLE2 ON..."
  },
  [...] // pars
);

Google BigQuery: insert records in a table from a list of javascript objects

This method is helpful when you need to insert one or more records in a BigQuery table, starting from the data model id related to such a table.

Syntax

var ok = utils.insertObjectsOnBigQuery(
  Map[] objects,
  Long dataModelId,
  Boolean interruptExecution
);

Details
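
Example

A sketch assuming each record is a JavaScript object mapping field names to values; the data model id is a placeholder.

var ok = utils.insertObjectsOnBigQuery(
  [
    { CODE: "A01", DESCRIPTION: "First item" },
    { CODE: "A02", DESCRIPTION: "Second item" }
  ], // list of records to insert (assumed shape)
  123,  // dataModelId of the BigQuery table (placeholder)
  true  // interruptExecution
);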

Google BigQuery: update a single record in a table from a javascript object

This method is helpful when you need to update a record in a BigQuery table, starting from the data model id related to such a table.

Syntax

var ok = utils.updateObjectOnBigQuery(
  Map object,
  Long dataModelId,
  Boolean interruptExecution
);

Details
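
Example

A sketch assuming the object carries the key fields identifying the record to update; field names and the data model id are placeholders.

var ok = utils.updateObjectOnBigQuery(
  { CODE: "A01", DESCRIPTION: "Updated description" }, // record to update (assumed shape)
  123,  // dataModelId of the BigQuery table (placeholder)
  true  // interruptExecution
);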

Google BigQuery: update multiple records in a table from a list of javascript objects

This method is helpful when you need to update one or more records in a BigQuery table, starting from the data model id related to such a table.

Syntax

var ok = utils.updateObjectsOnBigQuery(
  Map[] objects,
  Long dataModelId,
  Boolean interruptExecution
);

Details
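
Example

A sketch assuming each object carries the key fields identifying the record to update; field names and the data model id are placeholders.

var ok = utils.updateObjectsOnBigQuery(
  [
    { CODE: "A01", DESCRIPTION: "Updated description" },
    { CODE: "A02", DESCRIPTION: "Another updated description" }
  ], // list of records to update (assumed shape)
  123,  // dataModelId of the BigQuery table (placeholder)
  true  // interruptExecution
);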

Google BigQuery: delete one or more records starting from a list of javascript objects

This method is helpful when you need to delete one or more records in a BigQuery table, starting from the data model id related to such a table.

Syntax

var ok = utils.deleteObjectOnBigQuery(
  Map object,
  Long dataModelId,
  Boolean interruptExecution
);

Details
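
Example

A sketch assuming the object carries the key fields identifying the record to delete; field names and the data model id are placeholders.

var ok = utils.deleteObjectOnBigQuery(
  { CODE: "A01" }, // record to delete, identified by its key fields (assumed shape)
  123,  // dataModelId of the BigQuery table (placeholder)
  true  // interruptExecution
);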

Google BigQuery: insert a large number of records in a table from Google Datastore

This method is helpful when you need to execute a bulk insert of records in a BigQuery table, starting from records coming from Google Datastore.

The import is very fast: behind the scenes a CSV file is created with the data read from Datastore and loaded into BigQuery through a BigQuery bulk-load utility. This file must be temporarily stored in Google Cloud Storage (managed internally by Platform), so a GCS region (location: US, EU, etc.) is required.

Syntax

var ok = bulkImportFromDSToBigQuery(
  String gql,
  Long datastoreDataModelId,
  String location,
  String bigQueryTable,
  boolean interruptExecution,
  Object...pars
);

Details
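
Example

A sketch with placeholder values; the GQL query and the destination table format are assumptions, so check the Details for the exact expected values.

var ok = bulkImportFromDSToBigQuery(
  "SELECT * FROM MyDatastoreEntity",    // GQL query reading the Datastore records (placeholder)
  456,                                  // datastoreDataModelId (placeholder)
  "EU",                                 // GCS location used for the temporary CSV file
  "MYBIGQUERY_DATASET.MYBIGQUERYTABLE", // destination BigQuery table (assumed dataset.table format)
  true                                  // interruptExecution
);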

Google BigQuery: executing a DML instruction

This method is helpful when you need to execute a SQL instruction of any kind (e.g. a bulk DELETE or UPDATE) on a BigQuery table.

Syntax

var processedRecords = executeBigQueryDmlStatement(
  String datasetName,
  String sql,
  Object...pars
);

Details
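
Example

A sketch with placeholder names, executing a bulk DELETE.

var processedRecords = executeBigQueryDmlStatement(
  "MYBIGQUERY_DATASET",
  "DELETE FROM MYBIGQUERYTABLE WHERE COMPANY_ID=?", // DML statement (placeholder)
  [userInfo.companyId] // values for bind variables
);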
