ExportToDatabase

Settings:

Database type

Specify the type of database you want to use:

MySQL: Writes the data directly to a MySQL database. MySQL is open-source software; you may require help from your local Information Technology group to set up a database server.
MySQL / CSV: Writes a script file that contains SQL statements for creating a database and uploading the Per_Image and Per_Object tables. This option will write out the Per_Image and Per_Object table data to two CSV files; you can use these files can be used to import the data directly into an application that accepts CSV data.
SQLite: Writes SQLite files directly. SQLite is simpler to set up than MySQL and can more readily be run on your local computer rather than requiring a database server. More information about SQLite can be found here.

If running this module on a computing cluster, there are a few considerations to note:

The MySQL option is well-suited for cluster use, since multiple jobs can write to the database simultaneously.
The SQLite option is not as appropriate; a SQLite database only allows access by one job at a time.

Experiment name

Select a name for the experiment. This name will be registered in the database and linked to the tables that ExportToDatabase creates. You will be able to select the experiment by name in CellProfiler Analyst and will be able to find the experiment's tables through database queries.

Database name

Select a name for the database you want to use

SQL file prefix

(Used if MySQL / CSV is selected as the database type)
Enter the prefix to be used to name the SQL file.

Name the SQLite database file

(Used if SQLite selected as database type)
Enter the name of the SQLite database filename to which you want to write.

Overwrite without warning?

ExportToDatabase creates tables and databases at the start of a run when writing directly to a MySQL or SQLite database. It writes SQL scripts and CSVs when not writing directly. It also can write CellProfiler Analysis property files. In some cases, it is appropriate to run CellProfiler and append to or overwrite the data in existing tables, for instance when running several CellProfiler instances which each process a range of the experiment's image sets. In other cases, such as when the measurements to be written have changed, the data tables must be dropped completely.
You can choose fromm three options to conrtol overwriting behavior:

Never: ExportToDatabase will ask before dropping and recreating tables unless you are running headless. CellProfiler will exit if running headless if the tables exist and this option is chosen.
Data only: ExportToDatabase will keep the existing tables if present and will overwrite the data. Choose Data only if you are breaking your experiment into ranges of image sets and running each range on a separate instance of CellProfiler.
Data and schema: ExportToDatabase will drop previous versions of tables at the start of a run. This option is appropriate if you are using the CreateBatchFiles module; your tables will be created by the run that creates the batch data file. The actual analysis runs that utilize the Batch_data file will use the existing tables without trying to recreate them.

Add a prefix to table names?

Select whether you want to add a prefix to your table names. The default table names are Per_Image for the per-image table and Per_Object for the per-object table. Adding a prefix can be useful for bookkeeping purposes.

Select Yes to add a user-specified prefix to the default table names. If you want to distinguish multiple sets of data written to the same database, you probably want to use a prefix.
Select No to use the default table names. For a one-time export of data, this option is fine.

Whether you chose to use a prefix or not, CellProfiler will warn you if your choice entails overwriting an existing table.

Table prefix

(Used if Add a prefix to table names? is selected)
Enter the table prefix you want to use.

MySQL has a 64 character limit on the full name of the table. If the combination of the table name and prefix exceeds this limit, you will receive an error associated with this setting.

Create a CellProfiler Analyst properties file?

Select Yes to generate a template properties file that will allow you to use your new database with CellProfiler Analyst (a data exploration tool which can also be downloaded from http://www.cellprofiler.org/). The module will attempt to fill in as many as the entries as possible based on the pipeline's settings, including the server name, username and password if MySQL is used.

Which objects should be used for locations?

(Used only if creating a properties file)
CellProfiler Analyst displays cells during classification. This setting determines which object centers will be used as the center of the cells to be displayed. Choose one of the listed objects and CellProfiler will save that object's location columns in the properties file so that CellProfiler Analyst centers cells using that object's center.

You can manually change this choice in the properties file by edting the cell_x_loc and cell_y_loc properties.

Note that if there are no objects defined in the pipeline (e.g. if only using MeasureImageQuality and/or Illumination Correction modules), a warning will diplay until you choose 'None' for the subsequent setting: 'Export measurements for all objects to the database?'.

Access CPA images via URL?

(Used only if creating a properties file)
The image paths written to the database will be the absolute path the the image files on your computer. If you plan to make these files accessible via the web, you can have CellProfiler Analyst prepend a URL to your file name. Eg: If an image is loaded from the path "/cellprofilerhttp://d1zymp9ayga15t.cloudfront.net/images/" and you use a url prepend of "http://mysite.com/", CellProfiler Analyst will look for your file at "http://mysite.com/cellprofilerhttp://d1zymp9ayga15t.cloudfront.net/images/"

Enter an image url prepend if you plan to access your files via http

(Used only if accessing CellProfiler Analyst images via URL)
The image paths written to the database will be the absolute path the the image files on your computer. If you plan to make these files accessible via the web, you can enter a url prefix here. Eg: If an image is loaded from the path "/cellprofilerhttp://d1zymp9ayga15t.cloudfront.net/images/" and you use a url prepend of "http://mysite.com/", CellProfiler Analyst will look for your file at "http://mysite.com/cellprofilerhttp://d1zymp9ayga15t.cloudfront.net/images/"

If you are not using the web to access your files (i.e., they are locally aceesible by your computer), leave this setting blank.

Select the plate type

(Used only if creating a properties file)
If you are using a multi-well plate or microarray, you can select the plate type here. Supported types in CellProfiler Analyst are 96- and 384-well plates, as well as 5600-spot microarrays. If you are not using a plate or microarray, select None.

Select the plate metadata

(Used only if creating a properties file)
If you are using a multi-well plate or microarray, you can select the metadata corresponding to the plate here. If there is no plate metadata associated with the image set, select None.

Please see the Metadata module for more details on metadata collection and usage.

Select the well metadata

(Used only if creating a properties file)
If you are using a multi-well plate or microarray, you can select the metadata corresponding to the well here. If there is no well metadata associated with the image set, select None.

Please see the Metadata module for more details on metadata collection and usage.

Include information for all images, using default values?

(Used only if creating a properties file)
Select Yes to include information in the properties file for all images. This option will do the following:

All images loaded using the Input modules or saved in SaveImages will be included.
The CellProfiler image name will be used for the image_name field.
A channel color listed in the image_channel_colors field will be assigned to the image by default order.

Select No to specify which images should be included or to override the automatic values.

Select an image to include

(Used only if creating a properties file and specifiying the image information)
Choose image name to include it in the properties file of images.

The images in the drop-down correspond to images that have been:

Loaded using one of the Load modules.
Saved with the SaveImages module, with the corresponding file and path information stored.

If you do not see your desired image listed, check the settings on these modules.

Use the image name for the display?

(Used only if creating a properties file and specifiying the image information)
Select Yes to use the image name as given above for the displayed name.

Select No to name the file yourself.

Image name

(Used only if creating a properties file, specifiying the image information and naming the image)
Enter a name for the specified image

Channel color

(Used only if creating a properties file and specifiying the image information)
Enter a color to display this channel.

Do you want to add group fields?

(Used only if creating a properties file)
Please note that "groups" as defined by CellProfiler Analyst has nothing to do with "grouping" as defined by CellProfiler in the Groups module.

Select Yes to define a "group" for your image data (for example, when several images represent the same experimental sample), by providing column(s) that identify unique images (the image key) to another set of columns (the group key).

The format for a group in CPA is:
group_SQL_<XXX> = <MySQL SELECT statement that returns image-key columns followed by group-key columns> For example, if you wanted to be able to group your data by unique plate names, you could define a group called SQL_Plate as follows:
group_SQL_Plate = SELECT ImageNumber, Image_Metadata_Plate FROM Per_Image

Grouping is useful, for example, when you want to aggregate counts for each class of object and their scores on a per-group basis (e.g., per-well) instead of on a per-image basis when scoring with Classifier. It will also provide new options in the Classifier fetch menu so you can fetch objects from images with specific values for the group columns.

Enter the name of the group

(Used only if creating a properties file and specifiying an image data group)
Enter a name for the group. Only alphanumeric characters and underscores are permitted.

Enter the per-image columns which define the group, separated by commas

(Used only if creating a properties file and specifiying an image data group)
To define a group, enter the image key columns followed by group key columns, each separated by commas.

In CellProfiler, the image key column is always given the name as ImageNumber; group keys are typically metadata columns which are always prefixed with Image_Metadata_. For example, if you wanted to be able to group your data by unique plate and well metadata tags, you could define a group with the following MySQL statement:
group_SQL_Plate = SELECT ImageNumber, Image_Metadata_Plate, Image_Metadata_Well FROM Per_Image
For this example, the columns to enter in this setting would be:
ImageNumber, Image_Metadata_Plate, Image_Metadata_Well

Groups are specified as MySQL statements in the properties file, but please note that the full SELECT and FROM clauses will be added automatically, so there is no need to enter them here.

Do you want to add filter fields?

(Used only if creating a properties file)
Select Yes to specify a subset of the images in your experiment by defining a filter. Filters are useful, for example, for fetching and scoring objects in Classifier or making graphs using the plotting tools that satisfy a specific metadata contraint.

Automatically create a filter for each plate?

(Used only if creating a properties file and specifiying an image data filter)
If you have specified a plate metadata tag, selecting Yes to create a set of filters in the properties file, one for each plate.

Enter a phenotype class table name if using the classifier tool

(Used only if creating a properties file)
If you are using the machine-learning tool in CellProfiler Analyst, you can create an additional table in your database which contains the per-object phenotype labels. This table is produced after scoring all the objects in your data set and will be named with the label given here.

You can manually change this choice in the properties file by edting the class_table field. Leave this field blank if you are not using the classifier or do not need the table written to the database.

Output file location

(Used only when using a CSV or a SQLite database, and/or creating a properties or workspace file)
This setting determines where the CSV files or SQLite database is saved if you decide to write measurements to files instead of writing them directly to the database. If you request a CellProfiler Analyst properties file or workspace file, it will also be saved to this location. You can choose among the following options which are common to all file input/output modules:

Default Input Folder: Use the default input folder.
Default Output Folder: Use from the default output folder.
Elsewhere...: Use a particular folder you specify.
Default input directory sub-folder: Enter the name of a subfolder of the default input folder or a path that starts from the default input folder.
Default output directory sub-folder: Enter the name of a subfolder of the default output folder or a path that starts from the default output folder.

Elsewhere and the two sub-folder options all require you to enter an additional path name. You can use an absolute path (such as "C:\imagedir\image.tif" on a PC) or a relative path to specify the file location relative to a directory):

Use one period to represent the current directory. For example, if you choose Default Input Folder sub-folder, you can enter "./MyFiles" to look in a folder called "MyFiles" that is contained within the Default Input Folder.
Use two periods ".." to move up one folder level. For example, if you choose Default Input Folder sub-folder, you can enter "../MyFolder" to look in a folder called "MyFolder" at the same level as the Default Input Folder.

For Elsewhere..., Default Input Folder sub-folder and Default Output Folder sub-folder, if you have metadata associated with your images via Metadata module, you can name the folder using metadata tags. You can insert a previously defined metadata tag by either using:

The insert key
A right mouse button click inside the control
In Windows, the Context menu key, which is between the Windows key and Ctrl key

The inserted metadata tag will appear in green. To change a previously inserted metadata tag, navigate the cursor to just before the tag and either:

Use the up and down arrows to cycle through possible values.
Right-click on the tag to display and select the available values.

For instance, if you have a metadata tag named "Plate", you can create a per-plate folder by selecting one of the subfolder options and then specifying the subfolder name with the "Plate" metadata tag. The module will substitute the metadata values for the last image set processed for any metadata tags in the folder name. Please see the Metadata module for more details on metadata collection and usage.

Create a CellProfiler Analyst workspace file?

Select Yes to generate a workspace file for use with CellProfiler Analyst, a data exploration tool which can also be downloaded from http://www.cellprofiler.org/. A workspace file allows you to open a selected set of measurements with the display tools of your choice. This is useful, for example, if you want examine a standard set of quality control image measurements for outliers.

Select the measurement display tool

(Used only if creating a workspace file)
Select what display tool in CPA you want to use to open the measurements.

ScatterPlot
Histogram
DensityPlot
PlateViewer
BoxPlot

Type of measurement to plot on the X-axis

(Used only if creating a workspace file)
You can plot two types of measurements:

Image: For a per-image measurement, one numerical value is recorded for each image analyzed. Per-image measurements are produced by many modules. Many have MeasureImage in the name but others do not (e.g., the number of objects in each image is a per-image measurement made by IdentifyObject modules).
Object: For a per-object measurement, each identified object is measured, so there may be none or many numerical values recorded for each image analyzed. These are usually produced by modules with MeasureObject in the name.

Enter the object name

(Used only if creating a workspace file)
Select the object that you want to measure from the list. This should be an object created by a previous module such as IdentifyPrimaryObjects, IdentifySecondaryObjects, or IdentifyTertiaryObjects.

Select the X-axis measurement

(Used only if creating a workspace file)
Select the measurement to be plotted on the desired axis.

Type of measurement to plot on the Y-axis

(Used only if creating a workspace file)
You can plot two types of measurements:

Image: For a per-image measurement, one numerical value is recorded for each image analyzed. Per-image measurements are produced by many modules. Many have MeasureImage in the name but others do not (e.g., the number of objects in each image is a per-image measurement made by IdentifyObject modules).
Object: For a per-object measurement, each identified object is measured, so there may be none or many numerical values recorded for each image analyzed. These are usually produced by modules with MeasureObject in the name.

Select the Y-axis measurement

(Used only if creating a workspace file)
Select the measurement to be plotted on the desired axis.

Calculate the per-image mean values of object measurements?

Select Yes for ExportToDatabase to calculate population statistics over all the objects in each image and store the results in the database. For instance, if you are measuring the area of the Nuclei objects and you check the box for this option, ExportToDatabase will create a column in the Per_Image table called "Mean_Nuclei_AreaShape_Area".

You may not want to use ExportToDatabase to calculate these population statistics if your pipeline generates a large number of per-object measurements; doing so might exceed database column limits. These columns can be created manually for selected measurements directly in MySQL. For instance, the following SQL command creates the Mean_Nuclei_AreaShape_Area column:

ALTER TABLE Per_Image ADD (Mean_Nuclei_AreaShape_Area); UPDATE Per_Image SET Mean_Nuclei_AreaShape_Area = (SELECT AVG(Nuclei_AreaShape_Area) FROM Per_Object WHERE Per_Image.ImageNumber = Per_Object.ImageNumber);

Calculate the per-well mean values of object measurements?

Select Yes for ExportToDatabase to calculate statistics over all the objects in each well and store the results as columns in a "per-well" table in the database. For instance, if you are measuring the area of the Nuclei objects and you check the aggregate mean box in this module, ExportToDatabase will create a table in the database called "Per_Well_avg", with a column called "Mean_Nuclei_AreaShape_Area". Selecting all three aggregate measurements will create three per-well tables, one for each of the measurements.

The per-well functionality will create the appropriate lines in a .SQL file, which can be run on your Per-Image and Per-Object tables to create the desired per-well table.

Note: this option is only available if you have extracted plate and well metadata from the filename using the Metadata or LoadData modules. It will write out a .sql file with the statements necessary to create the Per_Well table, regardless of the option chosen above. Please see the Metadata module for more details on metadata collection and usage

Calculate the per-well median values of object measurements?

Select Yes for ExportToDatabase to calculate statistics over all the objects in each well and store the results as columns in a "per-well" table in the database. For instance, if you are measuring the area of the Nuclei objects and you check the aggregate median box in this module, ExportToDatabase will create a table in the database called "Per_Well_median", with a column called "Median_Nuclei_AreaShape_Area". Selecting all three aggregate measurements will create three per-well tables, one for each of the measurements.

The per-well functionality will create the appropriate lines in a .SQL file, which can be run on your Per-Image and Per-Object tables to create the desired per-well table.

Calculate the per-well standard deviation values of object measurements?

Select Yes for ExportToDatabase to calculate statistics over all the objects in each well and store the results as columns in a "per-well" table in the database. For instance, if you are measuring the area of the Nuclei objects and you check the aggregate standard deviation box in this module, ExportToDatabase will create a table in the database called "Per_Well_std", with a column called "Mean_Nuclei_AreaShape_Area". Selecting all three aggregate measurements will create three per-well tables, one for each of the measurements.

The per-well functionality will create the appropriate lines in a .SQL file, which can be run on your Per-Image and Per-Object tables to create the desired per-well table.

Export measurements for all objects to the database?

This option lets you choose the objects whose measurements will be saved in the Per_Object and Per_Well(s) database tables.

All: Export measurements from all objects.
None: Do not export data to a Per_Object table. Save only Per_Image or Per_Well measurements (which nonetheless include population statistics from objects).
Select...: Select the objects you want to export from a list.

Select the objects

(Used only if Select is chosen for adding objects)
Choose one or more objects from this list (click using shift or command keys to select multiple objects). The list includes the objects that were created by prior modules. If you choose an object, its measurements will be written out to the Per_Object and/or Per_Well(s) tables, otherwise, the object's measurements will be skipped.

Create one table per object, a single object table or a single object view?

ExportToDatabase can create either one table for each type of object exported or a single object table.

One table per object type creates one table for each object type you export. The table name will reflect the name of your objects. The table will have one row for each of your objects. You can write SQL queries that join tables using the "Number_ObjectNumber" columns of parent objects (such as those created by IdentifyPrimaryObjects) with the corresponding "Parent_... column" of the child objects. Choose One table per object type if parent objects can have more than one child object, if you want a relational representation of your objects in the database, or if you need to split columns among different tables and shorten column names because of database limitations.
Single object table creates a single database table that records the object measurements. ExportToDatabase will prepend each column name with the name of the object associated with that column's measurement. Each row of the table will have measurements for all objects that have the same image and object number. Choose Single object table if parent objects have a single child, or if you want a simple table structure in your database. You can combine the measurements for all or selected objects in this way.
Single object view creates a single database view to contain the object measurements. A view is a virtual database table which can be used to package together multiple per-object tables into a single structure that is accessed just like a regular table. Choose Single object view if you want to combine multiple objects but using Single object table would produce a table that hits the database size limitations.
An important note is that only objects that are related as primary, secondary or tertiary objects to each other should be combined in a view. This is because the view expects a one-to-one relationship between the combined objects. If you are selecting objects for the view, the module will warn you if they are not related in this way.

Maximum # of characters in a column name

This setting limits the number of characters that can appear in the name of a field in the database. MySQL has a limit of 64 characters per field, but also has an overall limit on the number of characters in all of the columns of a table. ExportToDatabase will shorten all of the column names by removing characters, at the same time guaranteeing that no two columns have the same name.

Write image thumbnails directly to the database?

(Used only if MySQL or SQLite are selected as database type)
Select Yes if you'd like to write image thumbnails directly into the database. This will slow down the writing step, but will enable new functionality in CellProfiler Analyst such as quickly viewing images in the Plate Viewer tool by selecting "thumbnail" from the "Well display" dropdown.

Select the images for which you want to save thumbnails

(Used only if MySQL or SQLite are selected as database type and writing thumbnails is selected)
Select the images that you wish to save as thumbnails to the database. Make multiple selections by using Ctrl-Click (Windows) or Command-Click (Mac);

Auto-scale thumbnail pixel intensities?

(Used only if MySQL or SQLite are selected as database type and writing thumbnails is selected)
Select Yes if you'd like to automatically rescale the thumbnail pixel intensities to the range 0-1, where 0 is black/unsaturated, and 1 is white/saturated.

Module: ExportToDatabase

Available measurements

Settings:

Database type

Experiment name

Database name

SQL file prefix

Name the SQLite database file

Overwrite without warning?

Add a prefix to table names?

Table prefix

Create a CellProfiler Analyst properties file?

Which objects should be used for locations?

Access CPA images via URL?

Enter an image url prepend if you plan to access your files via http

Select the plate type

Select the plate metadata

Select the well metadata

Include information for all images, using default values?

Select an image to include

Use the image name for the display?

Image name

Channel color

Do you want to add group fields?

Enter the name of the group

Enter the per-image columns which define the group, separated by commas

Do you want to add filter fields?

Automatically create a filter for each plate?

Enter a phenotype class table name if using the classifier tool

Output file location

Create a CellProfiler Analyst workspace file?

Select the measurement display tool

Type of measurement to plot on the X-axis

Enter the object name

Select the X-axis measurement

Type of measurement to plot on the Y-axis

Select the Y-axis measurement

Calculate the per-image mean values of object measurements?

Calculate the per-well mean values of object measurements?

Calculate the per-well median values of object measurements?

Calculate the per-well standard deviation values of object measurements?

Export measurements for all objects to the database?

Select the objects

Create one table per object, a single object table or a single object view?

Maximum # of characters in a column name

Write image thumbnails directly to the database?

Select the images for which you want to save thumbnails

Auto-scale thumbnail pixel intensities?