Dev4Snow


Scan Data Model

Scan Data Model is a computer vision feature that combines state-of-the-art OCR techniques, spatial analysis and pattern recognition to convert images of data models into tables. The OCR algorithm recognizes both handwritten and printed text, so you can capture data from a whiteboard, from paper or from a screen, using a mobile phone, a tablet or a desktop computer.

To start Scan Data Model, click the option with the same name in the tree, or click the Scan Data Model icon in the Data Modeler tab. A new Scan Data Model session is created if there is no previous session for the combination of Snowflake account and user; otherwise the last open session is retrieved.

Image Gathering tab

  • Use your mobile phone or tablet to scan the QR code, visit that link and use the web app to take pictures of your data models (from a whiteboard, paper or a screen). The pictures are immediately uploaded to the Dev4Snow website, analyzed and made available to the Dev4Snow tool. Click “Refresh” on the Image Gathering tab or “Refresh list” on the Scanned Images tab to update your list of images.
  • Click “Computer Camera” to open the Dev4Snow URL in your desktop browser, where you can take pictures with your computer camera. The pictures are uploaded, analyzed and made available to Dev4Snow. Click “Refresh” on the Image Gathering tab or “Refresh list” on the Scanned Images tab to update your list of images.
  • Click “Upload file(s)” to upload one or more images stored on your computer.
  • Click “Read from Clipboard” to import the last image copied to your clipboard.
  • Click “Close session” and confirm the action to permanently delete your scanned images and close the session.

Import Settings

These options will apply in the Scanned Images tab when importing uploaded images into the Data Modeler:

  • Convert texts to Upper case and replace non-standard symbols: Table names and column names are automatically converted to upper case, and invalid symbols are replaced with underscores (“_”).
    • Camel Case to Snake Case (optional): In addition to converting to upper case, this option converts names written in Camel Case to Snake Case (e.g. “CustomerID” is converted to “CUSTOMER_ID”).
       
  • Accept Lower case and Symbols, adding Quotation marks if needed: Lower case and symbols are accepted. If the text is all in upper case it is returned as is; otherwise it is returned enclosed in double quotation marks.
     
  • Proximity default %: This number sets the default proximity distance, expressed as a percentage of the image size, within which texts are considered part of the same object. For example, 8% means that any text located within 8% of the image size from another text is considered part of the same object. The initial default is 8%, but you can change it here if a different value works better for you, or override it for a particular image in the Scanned Images tab.
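
As a rough illustration of the two naming options above, the following Python sketch shows one way such conversions could be written. It is not Dev4Snow's actual implementation: the function names are hypothetical, the set of "invalid symbols" is assumed to be anything other than letters, digits and underscores, and "all in upper case" is assumed to mean a plain identifier made only of upper-case letters, digits and underscores.

```python
import re


def camel_to_snake(name: str) -> str:
    """Insert underscores at Camel Case word boundaries (case is kept)."""
    # Boundary between a lower-case letter or digit and an upper-case letter:
    # "CustomerID" -> "Customer_ID"
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Boundary between an acronym and a capitalized word:
    # "HTTPServer" -> "HTTP_Server"
    return re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", "_", name)


def to_upper_identifier(name: str, camel_to_snake_case: bool = False) -> str:
    """Option 1: upper-case the name and replace invalid symbols with "_"."""
    name = name.strip()
    if camel_to_snake_case:
        name = camel_to_snake(name)
    # Assumption: every character outside A-Z, a-z, 0-9 and "_" is invalid.
    return re.sub(r"[^A-Za-z0-9_]", "_", name).upper()


def quote_if_needed(name: str) -> str:
    """Option 2: keep the text, adding double quotation marks if needed."""
    # Assumption: a name needs no quotes only when it is a plain
    # upper-case identifier.
    if re.fullmatch(r"[A-Z_][A-Z0-9_]*", name):
        return name
    return '"' + name + '"'
```

For example, `to_upper_identifier("CustomerID", camel_to_snake_case=True)` gives `CUSTOMER_ID`, while `quote_if_needed("customerId")` returns the name wrapped in double quotation marks.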

 

Scanned Images tab

All the pictures taken for this session show up in this tab, except the ones that were deleted. You can import them into the Data Modeler one at a time, or several at once by selecting them from the list.

Actions available for this tab:

  • Refresh list: After taking new pictures with a mobile phone or tablet, click this button to refresh the list of images. Images are kept on the server for 7 days and then deleted permanently, so after taking new pictures refresh the list and click each of the new images to download it from the server. Pictures downloaded to your computer remain there until you delete them.
  • Proximity: Change this number to set the proximity distance percentage within which texts are considered part of the same object. For example, 8% proximity means that any text located within 8% of the image size from another text is considered part of the same object. Rectangles or lines around objects are not taken into account; only the position of the texts is used to detect objects.
  • Reset to default: Click this button to delete all the object selections and reapply the selected proximity to the diagram.
  • Manual object selection: When the automatic object detection fails for one or more objects, try adjusting the proximity % until all of the objects are correctly selected. If changing the proximity does not produce the correct result, select the object boundaries manually by clicking on one corner of an object and dragging the mouse to the opposite corner. To deselect an existing object, just click on it.
  • Import tables from current image: After verifying that the objects are correctly selected, click this button to import all of its tables to the Data Modeler.
  • Import tables from selected images: To import tables from multiple images into the Data Modeler, select the images using the checkboxes and click this button.
  • Select/Deselect All: Click this button to select/deselect all the images in the list.
  • Delete selected rows: Click this button to delete the selected images. A confirmation is required.
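
To make the proximity setting more concrete, here is a minimal Python sketch of grouping texts by position only. It is an illustration under stated assumptions, not Dev4Snow's algorithm: the `TextBox` type and the function names are hypothetical, "image size" is assumed to be the larger of the image width and height, and the distance between two texts is assumed to be the gap between their bounding boxes.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List


@dataclass(frozen=True)
class TextBox:
    """A recognized text and its bounding box in pixels (hypothetical type)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def gap(a: TextBox, b: TextBox) -> float:
    """Shortest distance between two boxes (0 if they touch or overlap)."""
    dx = max(b.x - (a.x + a.w), a.x - (b.x + b.w), 0.0)
    dy = max(b.y - (a.y + a.h), a.y - (b.y + b.h), 0.0)
    return (dx * dx + dy * dy) ** 0.5


def group_by_proximity(boxes: List[TextBox], image_w: float, image_h: float,
                       proximity_pct: float = 8.0) -> List[List[str]]:
    """Group texts whose gap is at most proximity_pct % of the image size."""
    # Assumption: "image size" means the larger image dimension.
    threshold = max(image_w, image_h) * proximity_pct / 100.0
    parent = list(range(len(boxes)))  # union-find forest, one node per text

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Texts that are close enough end up in the same set, transitively.
    for i, j in combinations(range(len(boxes)), 2):
        if gap(boxes[i], boxes[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box.text)
    return list(groups.values())


boxes = [
    TextBox("CUSTOMER", 100, 100, 120, 20),
    TextBox("ID", 100, 130, 40, 20),
    TextBox("NAME", 100, 160, 60, 20),
    TextBox("ORDERS", 600, 100, 100, 20),
    TextBox("ORDER_ID", 600, 130, 90, 20),
]
objects = group_by_proximity(boxes, image_w=1000, image_h=800)
```

With these sample coordinates and the default 8%, the texts fall into two objects (CUSTOMER and ORDERS). Raising the percentage far enough merges them into one, which mirrors why adjusting the proximity changes the detected objects.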

 

Importing images

After you click Import tables from current image or Import tables from selected images, the import screen is displayed.

In this screen you can edit the table name, column names and data types for each of the tables. After making the necessary changes for a table, click OK to import the table into the Data Modeler and advance to the next table. Alternatively, click OK to all to automatically import the current table and the rest of the tables without editing anything. The process stops if a table already exists or if there is an invalid value, so that you can edit it and continue.

Click Skip this object if you want to skip the current table and move to the next one.

Click Cancel if you want to cancel the import process for the current table and the rest of the tables.

 

Table accepted patterns

The following are some examples of accepted patterns for tables that Dev4Snow will be able to recognize: 

Data Types

Most of the data types used in common databases are recognized by Dev4Snow and automatically converted into data types accepted by Snowflake. If a data type is not recognized, a VARCHAR data type is used. If the data type is missing, Dev4Snow tries to infer it from the column name by applying these rules in this order:

  1. Column names containing the string DESCRIPTION(S), DESC or TEXT => VARCHAR(255)
  2. Column names containing the string DATETIME/TIMESTAMP/CREATED => TIMESTAMP_NTZ
  3. Column names containing the string DATE => DATE
  4. Column names containing the string TIME => TIME
  5. Column names containing the string PRICE/AMOUNT/WEIGHT/HEIGHT/WIDTH/LENGTH => NUMBER(18,3)
  6. Column names containing the string QUANTITY/QTY/STOCK => NUMBER(10)
  7. Column names containing the string _ID or named “ID” => NUMBER(10)

If the data type cannot be inferred a VARCHAR data type will be used.
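
The ordered rules above can be sketched as a simple first-match lookup. This is only an illustration: the function name and the rule table are hypothetical, and matching is assumed to be a case-insensitive substring test on the column name.

```python
# Ordered (substrings, data type) rules; the first match wins.
INFERENCE_RULES = [
    (("DESCRIPTION", "DESC", "TEXT"), "VARCHAR(255)"),
    (("DATETIME", "TIMESTAMP", "CREATED"), "TIMESTAMP_NTZ"),
    (("DATE",), "DATE"),
    (("TIME",), "TIME"),
    (("PRICE", "AMOUNT", "WEIGHT", "HEIGHT", "WIDTH", "LENGTH"), "NUMBER(18,3)"),
    (("QUANTITY", "QTY", "STOCK"), "NUMBER(10)"),
]


def infer_data_type(column_name: str) -> str:
    """Infer a Snowflake data type from a column name, rule by rule."""
    name = column_name.upper()  # assumption: matching ignores case
    for substrings, data_type in INFERENCE_RULES:
        if any(s in name for s in substrings):
            return data_type
    # Rule 7: the name contains "_ID" or is exactly "ID".
    if name == "ID" or "_ID" in name:
        return "NUMBER(10)"
    # No rule matched: fall back to VARCHAR.
    return "VARCHAR"
```

The order matters: a name such as ORDER_DATETIME contains both DATETIME and DATE, and because the DATETIME rule is checked first the result is TIMESTAMP_NTZ rather than DATE.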

 

Known limitations

  • Dev4Snow recognizes tables but not the relationships between tables. After importing the tables into the Data Modeler, you can create table relationships by dragging arrows from the source tables (the ones with the foreign keys) to the destination tables (the ones where those foreign keys are part of the Primary Key). If the destination tables do not define Primary Keys, you will have to create them before creating the relationships.
  • If you are taking pictures of a whiteboard or paper, try to minimize the angle between the camera and the surface to get better results.