DIDM = Data-Informed Decision Making
[1.0.0] Ask: Question
        [1.1.0] Define Problem
        [1.2.0] Choose Measure
        [1.3.0] Determine Factors
[2.0.0] Acquire: Data
        [2.1.0] Design Experiment
        [2.2.0] Collect Data
                [2.2.1] Data Extract
                [2.2.2] Data Load:      Excel 
                                        > Data > Get Data > From File > From Excel Workbook
                                        > Select Workbook > Navigator > Select Sheets > Transform Data
                [2.2.3] Data Transform
[3.0.0] Analyze: Data
        [3.1.0] Box-Plot 

                IQR = Inter-quartile Range = Q3 - Q1
                
                Q4_Max = largest value in sample (outlier only if > FU)
                +------------------------------------- FU = Fence_Upper = Q3 + 1.5 * IQR
                +--------------------------- L = Whisker_High = (largest value <= FU)
                Q3
                Q2_Median (Mean and Mode coincide with it only for symmetric, unimodal data)
                Q1
                +--------------------------- S = Whisker_Low = (smallest value >= FL)
                +------------------------------------- FL = Fence_Lower = Q1 - 1.5 * IQR
                Q0_Min = smallest value in sample (outlier only if < FL)
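
The box-plot quantities above can be computed directly. This is a minimal sketch, not a definitive implementation: the function name `box_plot_summary` and the sample values are hypothetical, and the "inclusive" quartile method is an assumption (Excel, Power BI, and other tools may use a different quartile convention and give slightly different fences).

```python
import statistics


def box_plot_summary(values):
    """Return quartiles, fences, whiskers, and outliers for a sample."""
    data = sorted(values)
    # Assumption: "inclusive" quartiles; other conventions shift Q1 and Q3 slightly.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    fence_lower = q1 - 1.5 * iqr
    fence_upper = q3 + 1.5 * iqr
    # Values on or inside the fences; the whiskers end at the extremes of this list.
    inside = [v for v in data if fence_lower <= v <= fence_upper]
    return {
        "Q0_Min": data[0],
        "Q1": q1,
        "Q2_Median": q2,
        "Q3": q3,
        "Q4_Max": data[-1],
        "IQR": iqr,
        "FL": fence_lower,
        "FU": fence_upper,
        "S": inside[0],   # Whisker_Low
        "L": inside[-1],  # Whisker_High
        "outliers": [v for v in data if v < fence_lower or v > fence_upper],
    }


# Hypothetical sample with one unusually large value.
summary = box_plot_summary([2, 4, 5, 6, 7, 8, 9, 30])
```

In this sample only the value 30 lies beyond the upper fence, so the maximum is an outlier while the minimum is not.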

        [3.2.0] Iterate
[4.0.0] Apply: Expertise, Insights, Data Story
        Who is the audience?
        What is the key insight?
        What is the story's main message?
        What chart type suits the visualization?
        What action should the audience take?
[5.0.0] Announce: Decide, Communicate 
[6.0.0] Assess: Monitor outcome
        Bias?
ASK = Questions
   What is happening?
   Why?
   Will it happen again?
   What will happen if we change the inputs?
   What is the data telling us?

ANALYZE = Analytical Methods
   Mean, Median, Mode, SD, min, max
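
The descriptive measures listed above are all available in the Python standard library. A minimal sketch follows; the helper name `describe` and the sample values are hypothetical, and SD is taken to mean the sample standard deviation (n - 1 denominator), which is an assumption.

```python
import statistics


def describe(values):
    """Return the basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Assumption: sample SD (n - 1). Use statistics.pstdev for a full population.
        "sd": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical observations, for illustration only.
stats = describe([3, 5, 5, 6, 8, 9])
```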

ANALYZE = Predictive Analytics
   Logistic Regression
   Linear Regression
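
As an illustration of the linear regression item above, here is a minimal sketch of a one-predictor least-squares fit written from scratch. The function names `fit_line` and `predict` and the data points are hypothetical; a real analysis would more likely use an established statistics library. Logistic regression is not covered by this sketch.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return intercept + slope * x


# Hypothetical points that lie exactly on y = 1 + 2x.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
forecast = predict(5, slope, intercept)
```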

ANALYZE = Prescriptive Analytics
   How to make the best happen?
   Maximize profit
      Maximize revenue
      Minimize cost
Step 1 = Discovery
   ASK = Define business challenge
      How can we maximize profit?
         Maximize Revenue
            Match demand and supply of containers, by location
            Reduce Volume of Empty Containers
         Minimize Cost
            Reduce Dwell Time

   ASK = Identify key analytical questions
      What factors contribute to Dwell time?
      Recognize stakeholders
         The company and shareholders

   ACQUIRE = Evaluate data sources (internal, external)
      Internal Company data
         Volume, Dwell
      External data 
         Location holidays
         Location distances
      Define Success
         Get valuable insights for decision making from the data

      What is the data source?
      Does the data sample represent the population?
      Does the data distribution include outliers? Do they affect results?
      What assumptions lie behind the analysis?
      Do any conditions violate the assumptions ==> model invalid?
      What is the analytical approach? Any alternatives?
      Causality: does dX cause dY?
      Correlation != Causality
      Root Cause = ask "Why?" 5 times (5 Whys)

      [Population]
      +-- Random Select 
          +-- [Sample]
              +-- Random Assign
                  +-- [Group_Control]
                  +-- [Group_Treatment]
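
The select-then-assign design in the diagram above can be sketched as follows. The function name `select_and_assign`, the fixed seed, and the equal split between groups are assumptions made for illustration.

```python
import random


def select_and_assign(population, sample_size, seed=0):
    """Randomly select a sample, then randomly assign it to two groups."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    # Random Select: draw the sample without replacement.
    sample = rng.sample(population, sample_size)
    # Random Assign: shuffle, then split into control and treatment halves.
    rng.shuffle(sample)
    half = sample_size // 2
    group_control = sample[:half]
    group_treatment = sample[half:]
    return group_control, group_treatment


# Hypothetical population of 100 unit IDs.
population = list(range(100))
control, treatment = select_and_assign(population, 20)
```

Because both steps are random, neither the analyst nor the units influence who ends up in which group, which is what guards against selection bias.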

      Avoid Bias
      +-- First-conclusion
      +-- Confirmation
      +-- Survivorship


Step 2 = Data Prep
   ACQUIRE = Extract data from DB
      .xlsx file
   ACQUIRE = Load data into workspace
      Databricks
   ANALYZE = Data quality check
      Data Exploration
         Excel
         Power BI
         Python, Streamlit
Step 3 = Model Planning
   ACQUIRE = Transform data
      Numerical
      Categorical
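
The notes do not say which transforms are applied to each data type. As one common possibility, this sketch standardizes a numerical column (z-scores) and one-hot encodes a categorical column. The function names `standardize` and `one_hot` and the example values are hypothetical.

```python
import statistics


def standardize(values):
    """Numerical: rescale to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def one_hot(labels):
    """Categorical: one 0/1 indicator column per distinct label."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical columns, for illustration only.
z_scores = standardize([1, 2, 3])
categories, encoded = one_hot(["port", "rail", "port"])
```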
Step 4 = Model Building
   ANALYZE = Multiple Linear Regression
      Simple Linear Regression
   ANALYZE = Visualize
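
The multiple linear regression named in Step 4 can be sketched with the normal equations, solved here by Gauss-Jordan elimination using only the standard library. This is an illustrative sketch under stated assumptions, not the method the notes prescribe: the function name `fit_multiple` and the data are hypothetical, and a production model would normally rely on a numerical library.

```python
def fit_multiple(rows, ys):
    """Return least-squares coefficients [b0, b1, ..., bk] for y = b0 + b1*x1 + ..."""
    # Prepend a constant 1.0 to each row so b0 acts as the intercept.
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y] of the normal equations.
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical data generated from y = 1 + 2*x1 + 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1, 3, 4, 6, 8]
coefficients = fit_multiple(rows, ys)
```

With a single predictor per row, the same function reduces to the simple linear regression case.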
Step 5 = Communicate Results
   ANNOUNCE = Teams
   ANNOUNCE = TLoom
   ANNOUNCE = In-person classes (after)
Step 6 = Operationalize    
   ASSESS = Predict Dwell Times
   APPLY = Do the results make sense?