<style type="text/css"> .apiok{ color:#04B486 } .apitesting{ color:#ff8900 } .methodok{ color:#01A9DB } .extend{ color:#9370DB } .building{ color:#FCCF46 } .scheduled{ color:#BDBDBD } </style>

InCore Ops Document

About Document

Legend: Tested API · Untested API · Building API · Tested Method · Extendable Method · Scheduled

About Package


  • Packages used: MySQL, Flask, NumPy, pandas, Bokeh, Matplotlib, scikit-learn, Keras, TensorFlow


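  • Engine: MySQL
  • Connection in this project
    from utils import sql
    db = sql()  # assumption: sql() returns the wrapper object that exposes .cursor and .conn
    db.cursor.execute("some sql here")
    db.cursor.execute("select * from ...")
    data = [list(row) for row in db.conn.fetchall()]  # copy each result row into a list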
  • Tables and columns

Before Run

  • run python
  • clone the repository
  • build the database
  • Under the root folder, create the db, images, log, models, and tmp folders
  • Under the root folder, create secret.cfg, the configuration file of the system
    secretkey=the_Secret_Key(can be any string)


  • Under root folder, run python3

About Maintain

Two ways to enable debug mode (auto clone, auto restart)

  • Under root folder, run python3 --debug
  • Change self.maintaining to True in src/

About Token

When requesting a service, the request should contain a JWT token that validates the request source.

Contact DevOps for the audience in jwt.

The validator is tokenValidator in src/
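
To illustrate how a caller could produce and check such a token, here is a minimal sketch using only the Python standard library. It assumes an HS256 signature, a shared secret, and an `aud` claim; the real audience value comes from DevOps and the real check is done by tokenValidator. `make_token` and `check_token` are hypothetical helper names, not part of this project.

```python
import base64
import hashlib
import hmac
import json


def _b64url(raw: bytes) -> str:
    # JWT uses URL-safe base64 without '=' padding.
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_token(payload: dict, secret: str) -> str:
    """Build an HS256 JWT of the form header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    sig = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"),
                   hashlib.sha256).digest()
    return signing_input + "." + _b64url(sig)


def check_token(token: str, secret: str, audience: str) -> bool:
    """Return True if the signature is valid and the 'aud' claim matches."""
    try:
        head, body, sig = token.split(".")
    except ValueError:
        return False
    expected = hmac.new(secret.encode("utf-8"), f"{head}.{body}".encode("utf-8"),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(_b64url(expected).encode("ascii"), sig.encode("utf-8")):
        return False
    padded = body + "=" * (-len(body) % 4)  # restore the stripped padding
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims.get("aud") == audience
```

The resulting string is what would be sent in the "token" field of the requests described below.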

Parameters setting

  • file location: src/
  • Description: File folder, model folder, acceptable files, acceptable projects are defined in this file.
  • Usage: In each Python file
    • import with from params import params
    • create an instance: param=params()
    • then read a variable with param.var

System variable API

  • Folder location: src/controller/
  • Description: get system settings
[API] getDataProject
  • Description: Get the supported projects of each dataType

  • ==Usage==: GET http://host/sys/dataproject

        "status": Enum("success","error"),
        "msg": "error msg",
        "data": {
            "num": [
            "cv": [
            "nlp": [
[API] getDataExtension
  • Description: Get the supported file extensions of one dataType

  • ==Usage==: GET http://host/sys/dataextension with param

        "datatype": Enum("num", "cv", "nlp")

    and get response

        "status": Enum("success", "error")
        "msg": "error msg",
        "data": {
                ".ext1", ".ext2" ....

Data collection service

  • Folder location: src/service/dataService
  • Description: This service contains the upload, download, getColumn, getFileStatus, and delete APIs
[API] dataUpload
  • File location: src/service/dataService/controller/

  • Description: This file implements the upload API. When a file is uploaded, the service checks the file type and project type, generates a file UID, and then validates the file content with the checkers in src/resources/dataService/

  • ==Usage==: POST http://host/data/upload with a form

        "file": binaryFile,
        "type": "dataType ( num/cv/nlp)",
        "token": "token_string"

    and get a response

        "status": Enum("success", "error"),
        "msg": "error_msg",
            "fileUid": "the_generated_file_uid"
  • Acceptable file types and their rules:

    • Numerical project: A csv with column names and their values. The values should be numerical data (classifiable text will be supported in v2.0). For example:
    • NLP project: A tsv with column names. For a project with labels, at least one column must contain numerical values. For example:
      Sentence1	value	value2
      I am happy	1	1
      I am sad	0	0
      Sentence1	Sentence2	value
      I am happy	So am I :)	1
      I am happy	I am a student	0
    • CV project: A zip file. There should be exactly one csv file directly in the zip root, not in a folder. For a project with labels, there should be at least one column that contains numerical values. The other columns hold the image file paths (relative paths inside the zip). For example:
          |    |--imga.jpg
          |    |--imgb.png
          |    |--imgk.JPEG
          |    |--imgl.png
      and the csv is
[API] dataDownload
  • File location: src/service/dataService/controller/

  • Description: Download file

  • ==Usage==: GET http://host/data/download with a form

        "fileUid": "file_id"

    and get a binary response

[API] dataDelete
  • File location: src/service/dataService/controller/

  • Description: Delete file

  • ==Usage==: POST http://host/data/delete with a form

        "fileUid": "file_id",
        "token": "token_string"

    get a json

        "status": Enum("success", "error"),
        "msg": "error_msg",
[API] getColumn
  • File location: src/service/dataService/controller/

  • Description: Get column names and types

  • ==Usage==: POST http://host/data/getcol with a form

        "fileUid": "file_id",
        "token": "token_string"

    get a json

        "status": Enum("success", "error"),
        "msg": "error_msg",
                    "name": "col1_name",
                    "type": Enum("int", "float", "path", "string"),
                    "classifiable": Enum(1,0)
[API] getFileStatus
  • File location: src/service/dataService/controller/

  • Description: Get file (batch) status

  • ==Usage==: POST http://host/data/getstatus with a form

        "fileUid":(a json list string) "["file_id1", "file_id2"]",
        "token": "token_string"

    get a json

        "status": Enum("success", "error"),
        "msg": "error_msg",
            "status":[Enum(0, 1), Enum(0, 1)]

    0 for not in-use, 1 for in-use
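
As a sketch of how a client might prepare this request and read the result, the helpers below build the form and pair each file id with its flag. They assume that the flag list sits under a "data" key and comes back in the same order as the requested ids; both points are assumptions, and the helper names are hypothetical.

```python
import json


def build_status_form(file_uids, token):
    """Form fields for POST /data/getstatus; fileUid is a JSON list *string*."""
    return {"fileUid": json.dumps(list(file_uids)), "token": token}


def in_use_by_uid(file_uids, response):
    """Pair each requested uid with its in-use flag (1 -> True, 0 -> False)."""
    if response.get("status") != "success":
        raise RuntimeError(response.get("msg", "unknown error"))
    flags = response["data"]["status"]  # assumed nesting, see lead-in
    return {uid: bool(flag) for uid, flag in zip(file_uids, flags)}
```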

[Method] fileUidGenerator
  • File location: src/service/dataService/

  • Description: Generate unique file id

  • Usage:

    from service.dataService.utils import fileUidGenerator
[Method] fileChecker
  • File location: src/service/dataService/

  • Description: Validate file content

  • Usage:

    from service.dataService.utils import fileChecker
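
The real checker is not shown in this document. As an illustration of the kind of rule described for NLP uploads (a TSV with a header row and, for labelled projects, at least one numerical column), a minimal stand-alone sketch could look like this. `check_nlp_tsv` is a hypothetical name and does not reflect the actual fileChecker interface.

```python
import csv
import io


def _is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_nlp_tsv(content, need_label=True):
    """Return (ok, message) for TSV text with a header row."""
    rows = list(csv.reader(io.StringIO(content), delimiter="\t"))
    if len(rows) < 2:
        return False, "need a header row and at least one data row"
    header, body = rows[0], rows[1:]
    if any(len(r) != len(header) for r in body):
        return False, "row length does not match header"
    if need_label:
        # A column counts as numerical only if every value parses as a float.
        numeric_cols = [
            name for i, name in enumerate(header)
            if all(_is_number(r[i]) for r in body)
        ]
        if not numeric_cols:
            return False, "no numerical column found"
    return True, "ok"
```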
[Method] getColType
  • File location: src/service/dataService/

  • Description: Get column names and type

  • Usage:

    from service.dataService.utils import getColType

    This is what coltype looks like:

            'type':col1_type Enum("int", "float", "string", "path"),
            'classifiable': Enum(1, 0)
            'type':col1_type Enum("int", "float", "string", "path"),
            'classifiable': Enum(1, 0)
[Method] getDf
  • File location: src/service/dataService/

  • Description: Get the file content as a dataframe

  • Usage:

    from service.dataService.utils import getDf

    The returned data is a dataframe.

Visualize Service

  • File location: src/resources/visualizationService
  • Description: Uses bokeh to show data and images. If the data is not supported by bokeh, the service renders the figure with matplotlib and displays the resulting image through bokeh.
    For showing bokeh with js, please refer to section 2 of this article.
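
The visualization APIs below return a "div" and a "script" string. As a rough sketch of how a client could place them in a page, the hypothetical helper here simply assembles the HTML. It assumes the two strings form a Bokeh embed pair, which only renders once BokehJS is loaded on the page; the tags that load BokehJS depend on the server's Bokeh version and are left to the caller.

```python
def render_page(div, script, bokeh_js_tags=""):
    """Assemble an HTML page from the "div" and "script" strings of a response.

    bokeh_js_tags: <script> tags that load BokehJS (supplied by the caller).
    """
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"{bokeh_js_tags}\n"
        "</head>\n<body>\n"
        f"{div}\n{script}\n"
        "</body>\n</html>\n"
    )
```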
[API] getDataVizAlgoList
  • File location: src/service/visualizeService/controller/

  • Description: get data visualize algorithm

  • ==Usage==: GET http://host/viz/data/getalgo

    get a json

            "status": Enum("success", "error"),
            "msg": "error_msg",
                            "lib":"used lib",
                                "x":"float", -> need a x col and must be float or int
                                "y":"string", -> need a y col and must be string
                                "value":"none" -> not needed
                            "description":"2D line plot" -> algo description
                            "lib":"used lib",
                                "x":"int", -> need a x col and must be int
                                "y":"path", -> need a y col and must be path
                                "value":"float" -> need a value col and must be float
                            "description":"2D scatter plot" -> algo description
[API] dataViz
  • File location: src/service/visualizeService/controller/

  • Description: Visualizing data

  • ==Usage==: POST http://host/viz/data/do with param

        "fileUid": "fileID",
        "algoname": "algoname",
        "datacol": (a json string) "{
        "token": "token_string"

    and get a response

        "status": Enum("success", "error"),
        "msg": "error_msg",
            "div": "div of bokeh",
            "script": "script of bokeh"
[API] getImg
  • File location: src/service/visualizeService/controller/

  • Description: Get binary img

  • ==Usage==: GET http://host/viz/getimg with param

  • BaseClass File location: src/service/visualizeService/core/

Analytic Service

Preprocess and Data info

[CORE] missingFiltering
  • File location: src/service/analyticService/core/preprocess/
  • Description: Filter missing values of number, string, and path columns.
  • ==Usage==:
    • filtCols returns the filtered data


      • data: A 2D array of data
      • coltype: colType Enum("int", "float", "string", "path")
      • doList: whether to filter each column
      • pathBase: the base folder of CV file. (OPTIONAL)
          path_of_file (cv folder)

      returns a 2D array of filtered data

    • getRetainIndex returns which rows are retained


      • data: A 2D array of data to check for missing values
      • coltype: their colType
      • pathBase: the base folder of CV file. (OPTIONAL)
          path_of_file (cv folder)

      returns a 1D np array marking whether each row is retained
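
The idea behind getRetainIndex can be sketched in plain Python as follows. What counts as "missing" here (None, NaN, an empty string, or a path that does not exist under the base folder) is an assumption, and the sketch returns a list of booleans rather than the np array of the real method. `is_missing` and `get_retain_index` are hypothetical names.

```python
import math
import os


def is_missing(value, col_type, path_base=None):
    """True if a cell should count as missing for its column type."""
    if value is None:
        return True
    if col_type in ("int", "float"):
        try:
            return math.isnan(float(value))
        except (TypeError, ValueError):
            return True  # not parseable as a number
    if col_type == "string":
        return str(value).strip() == ""
    if col_type == "path":
        if str(value).strip() == "":
            return True
        # Only touch the file system when a base folder is given.
        return path_base is not None and not os.path.isfile(
            os.path.join(path_base, str(value)))
    raise ValueError(f"unknown column type: {col_type}")


def get_retain_index(data, coltype, path_base=None):
    """One boolean per row: keep the row only if no cell is missing."""
    return [
        not any(is_missing(cell, t, path_base) for cell, t in zip(row, coltype))
        for row in data
    ]
```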

[CORE] [EXTENDABLE] normalize
  • File location: src/service/analyticService/core/preprocess/

  • Description: Normalize the column

  • ==Usage==:

    Call the implemented algo class and use do to normalize


    • data: A 1D array of data

    get a 1D array of normalized data
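
For reference, min-max scaling to the 0~1 range (listed later as "minmax01") can be sketched as below. This is a stand-alone illustration, not the project's class; mapping a constant column to zeros is a choice made for this sketch.

```python
def minmax01(data):
    """Scale a 1D sequence of numbers linearly onto [0, 1]."""
    values = [float(v) for v in data]
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```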

[CORE] [EXTENDABLE] outlierFiltering
  • File location: src/service/analyticService/core/preprocess/

  • Description: Filter outliers

  • ==Usage==:

    Call the implemented algo class; getRetainIndex returns which rows are retained


    • data: A 1D array of data to check for outliers

    returns a 1D np array marking whether each row is retained
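
A standard-deviation filter of this kind can be sketched as follows. It assumes that the std1, std2, and std3 algorithms keep values within 1, 2, or 3 standard deviations of the column mean, which this document does not spell out, and it uses the population standard deviation. `std_retain_index` is a hypothetical name.

```python
import statistics


def std_retain_index(data, k=1):
    """True where |x - mean| <= k * std (population standard deviation)."""
    values = [float(v) for v in data]
    if not values:
        return []
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [abs(v - mean) <= k * std for v in values]
```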

[CORE] [EXTENDABLE] stringCleaning
  • File location: src/service/analyticService/core/preprocess/

  • Description: Clean string

  • ==Usage==:

    Call the implemented algo class and use do to get a clean string


    • data: A string

    returns the cleaned string
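
As an example of a cleaner like the "punctuation" algorithm listed later, the sketch below strips ASCII punctuation from a string. It is an illustration only; how the project's own cleaner treats non-ASCII punctuation is not stated here.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def remove_punctuation(text):
    """Drop ASCII punctuation characters and leave everything else untouched."""
    return text.translate(_PUNCT_TABLE)
```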

[CORE] [EXTENDABLE] correlation
  • File location: src/service/analyticService/core/

  • Description: Compute the correlation of the data

  • ==Usage==:

    Call the implemented algo class and use do to get the correlation


    • fileUid

    get a correlation dataframe and its bokeh div and script

        "div": "bokeh div",
        "script": "bokeh script",
        "dataframe": dataframe
[API] getPreprocessAlgo
  • File location: src/service/analyticService/controller/

  • Description: get preprocess algorithm list

  • ==Usage==: GET http://host/preprocess/getalgo

    get a json

        "status": Enum("success", "error"),
        "msg": "error msg",
        "data": {
            "normalize": [
                    "friendlyname": "Min-Max to 0~1",
                    "algoname": "minmax01"
            "outlierFiltering": [
                    "friendlyname": "1st standard deviation ",
                    "algoname": "std1"
                    "friendlyname": "2nd standard deviation ",
                    "algoname": "std2"
                    "friendlyname": "3rd standard deviation ",
                    "algoname": "std3"
            "stringCleaning": [
                    "friendlyname": "remove punctuation",
                    "algoname": "punctuation"
[API] doPreprocess
  • File location: src/service/analyticService/controller/

  • Description: preprocess a file and save it to another file

  • ==Usage==: POST http://host/preprocess/do with param

        "fileUid": "fileID",
        "action": (a json string) "[
                "missingFiltering": "0", -> no needed
                "outliterFiltering": "0", -> no needed
                "normalize": "0", -> no needed
                "stringCleaning": ["0"] -> no needed
                "missingFiltering": "1",  -> filt missing value
                "outliterFiltering": "algoname", -> use algoname to filt outlier
                "normalize": "algoname", -> normalize using algoname
                "stringCleaning": ["0"] -> a numerical column, no needed
                "missingFiltering": "1",
                "outliterFiltering": "0", -> string column, no needed
                "normalize": "0", -> string column, no needed
                "stringCleaning": ["algoname1","algoname2"] -> use algo1 and algo2 to clean the string
        "token": "token"

    and get a response

        "status": Enum("success", "error"),
        "msg": "error_msg",
            "fileUid": "uid of file after preprocessing"
[API] previewPreprocess
  • File location: src/service/analyticService/controller/

  • Description: Preview the result of preprocessed ==numerical== column (classifiable text will be supported in v2.0)

  • ==Usage==: POST http://host/preprocess/preview with param

        "fileUid": "fileID",
        "action": (a json string) "[
                "missingFiltering": "0",
                "outliterFiltering": "0",
                "normalize": "minmax01",
                "stringCleaning": ["0"]
        "token": "token_string"

    and get a response

    1. if figure is plotted:
        "status": "success" or "error",
        "msg": "error_msg",
            "msg":"preprocess result message",
                "div":"the bokeh div of before",
                "script":"the bokeh script of before"
                "div":"the bokeh div of after",
                "script":"the bokeh script of after"
    2. if no figure is plotted:
        "status": "success" or "error",
        "msg": "error_msg",
            "msg":"preprocess result message",
[API] getCorrelationAlgoList
  • File location: src/service/analyticService/controller/

  • Description: get data correlation algorithm

  • ==Usage==: GET http://host/correlation/getalgo

    get a json

        "status": "success",
        "msg": "",
        "data": [
                "friendlyname": "Pearson Correlation",
                "algoname": "pearson"
[API] doCorrelation
  • File location: src/service/analyticService/controller/

  • Description: get the correlation of a dataset (ONLY FOR NUM PROJECT)

  • ==Usage==: POST http://host/correlation/do with param

        "token": "token"
        "fileUid": "fileID",
        "algoname": "the algo name from getCorrelationAlgo response"

    and get a response

        "status": "success" or "error",
        "msg": "error_msg",
            "div": "bokeh div",
            "script": "bokeh script"


[CORE][EXTEND] Analytic Core
digraph hierarchy {

                //nodesep=1.0 // increases the separation between nodes
                node [color=Red,fontname=Courier,shape=box] //All nodes will have this shape and colour
                edge [color=Blue, style=dashed] //All the lines look like this

                analyticBase->{regressionBase classificationBase abnormalBase clusteringBase}
                regressionBase->{regAlgo1 regAlgo2 regAlgoN}
                classificationBase->{claAlgo1 claAlgo2 claAlgoN}
                abnormalBase->{abnAlgo1 abnAlgo2 abnAlgoN}
                clusteringBase->{cluAlgo1 cluAlgo2 cluAlgoN}
}
  • All classes are child classes of analyticBase
  • The four child PROJECT classes regressionBase, classificationBase, abnormalBase, and clusteringBase each serve one kind of training purpose
  • To train:
    alg=algo(algoInfo,fileID,'train') # algoInfo is defined in doModelTrain
  • To predict:
  • To test:
        "text": "The testing result",
                "div": "bokeh div",
                "script": "bokeh script"
                "div": "bokeh div",
                "script": "bokeh script"
  • To develop a new PROJECT class, implement:
    • Test: Generate the testing result (loss, accuracy, ...) as text from self.outputData and self.result, and save the string to self.txtRes
    • projectVisualize: Generate visualizations of the model and result from self.outputData, self.result, and self.model. The bokeh figures should be saved to self.vizRes as
              "div": "bokeh div",
              "script": "bokeh script"
              "div": "bokeh div",
              "script": "bokeh script"
[API] getAnalyticAlgoList
  • File location: src/service/analyticService/controller/

  • Description: get analytic algorithm list

  • ==Usage==: GET http://host/analytic/getalgo with param

        "dataType": "cv",
        "projectType": "classification"

    get a response

        "status": "status",
        "msg": "error msg",
        "data": ["algo1","algo2"]
[API] getAnalyticAlgoParam
  • File location: src/service/analyticService/controller/

  • Description: get parameter of an analytic algorithm

  • ==Usage==: GET http://host/analytic/getparam with param

        "algoName": "algonameYouWantToKnow"

    get a response

        "status": "status",
        "msg": "error msg",
        "data": {
            "dataType": "num",
            "algoName": "algonameYouWantToKnow",
            "description": "the description"
            "lib":"sklearn" / "keras",
                    "name": "param1Name",
                    "description": "param1 Description",
                    "type": "int",
                    "upperBound": upperBound,
                    "lowerBound": lowerBound,
                    "name": "param2Name",
                    "description": "param2 Description",
                    "type": "float",
                    "upperBound": upperBound,
                    "lowerBound": lowerBound,
                    "name": "param3Name",
                    "description": "param3 Description",
                    "type": "bool",
                    "name": "param4Name",
                    "description": "param4 Description",
                    "type": "enum",
                    "list": ["option1","option2","option3"],
                    "name": "param5Name",
                    "description": "param5 Description",
                    "type": "string",
                    "default":"default string
                    "name": "input1Name",
                    "description": "input1 description",
                    "type": "float",
                    "amount": "multiple",
                    "name": "input2Name",
                    "description": "input2 description",
                    "type": "classifiable",
                    "amount": "single"
                    "name": "input3Name",
                    "description": "input3 description",
                    "type": "string",
                    "amount": "single"
                    "name": "input4Name",
                    "description": "input4 description",
                    "type": "path",
                    "amount": "single"
                    "name": "output1Name",
                    "description": "output1 description",
                    "type": "float"
                    "name": "output2Name",
                    "description": "output2 description",
                    "type": "classifiable"
                    "name": "output3Name",
                    "description": "output3 description",
                    "type": "string"
                    "name": "output4Name",
                    "description": "output4 description",
                    "type": "path"
            ] # For unsupervised project, it's an empty list
[API] doModelTrain
  • File location: src/service/analyticService/controller/
  • Description: perform a model training
  • ==Usage==: POST http://host/analytic/train with param (algoInfo in core)
        "token": "token",
        "fileUid": "file id",
        "dataType": "num",
        "projectType": "classification",
        "algoName": "the algoname from getAlgoList",
        "param": (A json string) "{
            "param1Name" : 0.87,      # float example
            "param2Name" : 30,        # int example
            "param3Name" : 0,       # bool example
            "param4Name" : "option1", # enum example
            "param5Name" : "string"   # string example
        "input": (A json string) "{
            "input1Name" : ["col1","col2"], # multiple input example
            "input2Name" : ["col3"],         # single input example
        "output": (A json string. Pass "{}" for unsupervised project) "{
            "output1Name" : "col4",
            "output2Name" :  "col5"
    and get a response
        "status": "success"/"error",
        "msg": "error msg",
        "data": {
            "modelUid": "modelUid"
[API] stopTraining
  • File location: src/service/analyticService/controller/

  • Description: stop a model training

  • ==Usage==: DELETE http://host/analytic/stop with form

        "token": "token",
        "modelUid": "modelUid"

    and get a response

        "status": "success" / "error",
        "msg": "error msg",
        "data": {}

After training

[API] getModelPreview
  • File location: src/service/analyticService/controller/

  • Description: get the preview of model

  • ==Usage==: GET http://host/analytic/preview with form

        "token": "token",
        "modelUid": "modelUid"

    and get a response

        "status": "success" / "error",
        "msg": "error msg",
        "data": {
            "text": "the preview text",
            "fig": {
                    "div": "fig1 div",
                    "script": "fig1 script"
[API] doModelPredict
  • File location: src/service/analyticService/controller/

  • Description: perform prediction on a model using another file

  • ==Usage==: POST http://host/analytic/predict with form

        "token": "token",
        "modelUid": "modelUid",
        "fileUid": "fileUid",
        "preprocess": Enum(1, 0)

    and get a response

        "status": "success" / "error",
        "msg": "error msg",
        "data": {
            "preprocessedFileUid": "preprocessedFid", ("None" for no preprocess) 
            "predictedFileUid": "predictedFid"
[API] doModelTest
  • File location: src/service/analyticService/controller/

  • Description: perform test on a model using another file

  • ==Usage==: POST http://host/analytic/test with form

        "token": "token",
        "modelUid": "modelUid",
        "fileUid": "fileUid",
        "label": "label of abnormal detection testing"

    and get a response

        "status": "success" / "error",
        "msg": "error msg",
        "data": {
            "text": "the test result text",
            "fig": {
                    "div": "fig1 div",
                    "script": "fig1 script"
[API] deleteModel
  • File location: src/service/analyticService/controller/

  • Description: delete model

  • ==Usage==: POST http://host/analytic/delete with form

        "token": "token",
        "modelUid": "modelUid"

    and get a response

        "status": "success" / "error",
        "msg": "error msg",
        "data": {}
[API] getModelStatus
  • File location: src/service/analyticService/controller/

  • Description: get the status of a model

  • ==Usage==: GET http://host/analytic/get/status with form

        "token": "token",
        "modelUid": "modelUid"

    and get a response

        "status": Enum("success", "error"),
        "msg": "error msg",
        "data": Enum("success", "train", "fail")
[API] getModelParameter
  • File location: src/service/analyticService/controller/
  • Description: get the parameter (called algoInfo in code) of a model
  • ==Usage==: GET http://host/analytic/get/param with form
        "token": "token",
        "modelUid": "modelUid"
    and get a response
        "dataType": "num",
        "projectType": "classification",
        "algoName": "the algoname from getAlgoList",
        "param": (A json string) "{
            "paramName1" : 0.87,      # float example
            "paramName2" : 30,        # int example
            "paramName3" : 0,       # bool example
            "paramName4" : "option1", # enum example
            "paramName5" : "string"   # string example
        "input": (A json string) "{
            "input1" : ["col1","col2"], # multiple input example
            "input2" : ["col3"],         # single input example
        "output": (A json string. Pass "{}" for unsupervised project) "{
            "output1" : "col4",
            "output2" :  "col5"
[API] getModelFailReason
  • File location: src/service/analyticService/controller/

  • Description: get the fail reason of a model

  • ==Usage==: GET http://host/analytic/get/fail with form

        "token": "token",
        "modelUid": "modelUid"

    and get a response

        "status": "success" / "error",
        "msg": "error msg",
        "data": "the reason"