.. _nonmem_dat_to_popy_dat:

Nonmem Data to |popy| Data File
##################################

The |nonmem| data file has the same purpose as the |popy| |data_file|. It represents the |obss| at different time points and the dosing regimens for each subject. 

For simple data sets you may only need to substitute a few values to create a valid |popy| |data_file|. If you have multiple dosing regimens and multiple types of measurement then the conversion may be more difficult.

:numref:`table_nonmem_data_fields` lists the |popy| data field for each |nonmem| data field. Each subsection gives a one-to-one example conversion to |popy| format.

.. _table_nonmem_data_fields:

.. list-table:: Nonmem to |popy| Data 
    :header-rows: 1

    * - |nonmem|
      - |popy|
      - Comment
      
    * - :ref:`EVID`
      - |TYPE| 
      - Data row property field
        
    * - :ref:`ID<nm_id>`
      - |ID| 
      - Identity field

    * - :ref:`TIME<nm_time>`
      - |TIME| 
      - Time field
        
    * - :ref:`CMT`
      - |na| 
      - Compartment field
      
    * - :ref:`AMT<nm_amt>`
      - |AMT| 
      - Dose Amount field
        
    * - :ref:`DV`
      - |obs_field|
      - Observations
      
    * - :ref:`MDV`
      - |obs_flag_field|
      - Missing observations

Both |popy| and |nonmem| need to load a |data_file| when estimating parameters. See :ref:`$DATA` for how |nonmem| specifies the input data file path, and how to specify the equivalent path in a |popy| :ref:`fit_script`.
      
.. _EVID:
    
EVID
======

'EVID' is a required field in |nonmem|. There is an equivalent required |TYPE| field in |popy|. The major difference is that |nonmem| 'EVID' uses integers to define row properties, whereas |popy| uses human-readable strings, as shown in :numref:`table_nonmem_evid_values`.

.. _table_nonmem_evid_values:

.. list-table:: Nonmem EVID to |popy| TYPE 
    :header-rows: 1

    * - |nonmem| EVID
      - |popy| TYPE
      - Comment
      
    * - 0
      - obs 
      - Observation Row
      
    * - 1
      - dose
      - Dosing Row
      
    * - 2
      - pred
      - Prediction Row
        
    * - 3
      - reset
      - Reset Row
      
    * - 4
      - reset+dose
      - Reset and Dose Row
      
Note in |popy| the 'dose' |TYPE| entry can have a name suffix using |popy|'s ':' notation. See :ref:`specifying multiple dose types in PoPy<multi_dose_types>` for more details.
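
The mapping in :numref:`table_nonmem_evid_values` can be applied with a simple lookup. The following is a minimal sketch only; the names ``EVID_TO_TYPE`` and ``evid_to_type`` are hypothetical helpers for illustration and are not part of |popy|.

```python
# Lookup table copied from the EVID/TYPE table above.
EVID_TO_TYPE = {
    0: "obs",
    1: "dose",
    2: "pred",
    3: "reset",
    4: "reset+dose",
}


def evid_to_type(evid):
    """Return the TYPE string for a NONMEM EVID value (hypothetical helper)."""
    try:
        return EVID_TO_TYPE[int(evid)]
    except KeyError:
        raise ValueError(f"Unsupported EVID value: {evid!r}") from None


# EVID values are often read from a text file as strings.
print([evid_to_type(e) for e in ["0", "1", "1", "4"]])
# prints: ['obs', 'dose', 'dose', 'reset+dose']
```

Raising an error for unknown values, rather than passing them through, makes unexpected 'EVID' codes visible during conversion.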

.. _nm_ID:
    
ID
=====

'ID' is a required field in |nonmem| and |popy|. The 'ID' column defines the individual for each data row. It is usually |not| necessary to convert the 'ID' column of the data set.

However, note that in |popy| the same identifier in the 'ID' field is **always** treated as the same individual. In |nonmem| only identical identifiers that are in consecutive rows are treated as one individual.

For example, see :numref:`table_nonmem_ids`.

.. _table_nonmem_ids:

.. list-table:: Nonmem id example 
    :header-rows: 1

    * - ID
      - |nonmem|
      - |popy|
     
    * - Bill
      - New id
      - New id
      
    * - Bill
      - Bill again      
      - Bill again
      
    * - Sandra
      - New id
      - New id
        
    * - Bill
      - New id
      - Bill again      
      
    * - Sandra
      - New id
      - Sandra again   
   
|nonmem| thinks there are 4 subjects, whilst |popy| thinks there are 2.
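
The difference between the two counting rules can be checked before conversion. The sketch below is an illustration in plain Python, not |popy| code; it counts runs of consecutive identical identifiers (the |nonmem| rule described above) and unique identifiers (the |popy| rule) for the 'ID' column of :numref:`table_nonmem_ids`.

```python
from itertools import groupby

ids = ["Bill", "Bill", "Sandra", "Bill", "Sandra"]

# NONMEM rule: each run of consecutive identical IDs is one individual.
nonmem_subjects = [key for key, _ in groupby(ids)]

# PoPy rule: identical IDs always refer to the same individual.
popy_subjects = list(dict.fromkeys(ids))  # unique IDs, in first-seen order

print(len(nonmem_subjects), len(popy_subjects))
# prints: 4 2
```

If the two counts differ and the |nonmem| interpretation is the intended one, the repeated identifiers would need to be renamed so that each individual has a unique 'ID' in the |popy| |data_file|.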
      
      
.. _nm_time:
    
TIME
========

'TIME' is a required field in |nonmem| and |popy|. The 'TIME' column defines the time stamp for each row. It is usually |not| necessary to convert the 'TIME' column of the data set.

In both |nonmem| and |popy| the time field is required to be monotonically increasing, unless an |EVID| = 3 or 4 row is reached. In |popy| time is reset when a |TYPE| = 'reset' or 'reset+dose' row is reached.

One complication that can arise is if the |nonmem| data is split over date and time, for example see :numref:`table_nonmem_time`.

.. _table_nonmem_time:

.. list-table:: Nonmem date and time to |popy| time
    :header-rows: 1

    * - |nonmem| date
      - |nonmem| time
      - |popy| time
      
    * - 2016-02-12
      - 10:30
      - 0.0
    
    * - 2016-02-13
      - 19:01
      - 32.5167
          
    * - 2016-02-13
      - 19:02
      - 32.5333
      
    * - 2016-02-13
      - 23:39
      - 37.15
    
    * - 2016-02-13
      - 23:42
      - 37.2
      
    * - 2016-02-14
      - 10:06
      - 47.6
      
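Here the date and time stamps for each individual need to be converted to a single elapsed time, |ie| the time since that individual's first record (or since the last reset), for use in |popy|. In this example the elapsed time is expressed in hours.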
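
This conversion can be sketched with the Python standard library. The example assumes the date and time formats shown in :numref:`table_nonmem_time`, a single individual, no reset rows, and hours as the time unit; ``elapsed_hours`` is a hypothetical helper, not a |popy| function.

```python
from datetime import datetime

# (date, time) pairs for one individual, as in the table above.
rows = [
    ("2016-02-12", "10:30"),
    ("2016-02-13", "19:01"),
    ("2016-02-13", "19:02"),
    ("2016-02-13", "23:39"),
    ("2016-02-13", "23:42"),
    ("2016-02-14", "10:06"),
]


def elapsed_hours(rows):
    """Return hours elapsed since the first record (hypothetical helper)."""
    stamps = [datetime.strptime(f"{d} {t}", "%Y-%m-%d %H:%M") for d, t in rows]
    start = stamps[0]
    # Round to 4 decimal places to match the precision used in the table.
    return [round((s - start).total_seconds() / 3600.0, 4) for s in stamps]


hours = elapsed_hours(rows)
print(hours)
```

For data with several individuals or reset rows, the same calculation would be repeated for each individual and restarted after each reset.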


.. _CMT:
    
CMT
=======

The 'CMT' field in |nonmem| is used to specify the index of the compartment where doses will be administered, |ie| rows with EVID=1 or EVID=4 and CMT=x result in a bolus or infusion dose being administered to the compartment numbered 'x' in the :ref:`$DES` section of the |nm_ctl_file|.

|popy| deliberately does |not| specify the dose compartment in the |data_file|. Instead the dose compartment is specified in the |derivatives| section by the location of the :ref:`dosing_functions`.

If the data only contains one type of dose, |eg| one drug which is always a bolus or always an infusion, then you can just ignore the 'CMT' field when converting to |popy| format.

If the data contains multiple types of dose, however, then |popy| needs a way of distinguishing between them (|nonmem| typically uses a different CMT integer for each type). In |popy| you need to give each dose a name, using the ':' notation. An example data conversion with two types of dose is shown in :numref:`table_nonmem_cmt`.
      
.. _table_nonmem_cmt:

.. list-table:: Nonmem cmt to |popy| dose name 
    :header-rows: 1

    * - |nonmem| evid
      - |nonmem| cmt
      - |popy| type
      - Comment
      
    * - 1
      - 1
      - dose:first_drug
      - drug one in first compartment
    
    * - 1
      - 1
      - dose:first_drug
      - drug one in first compartment
      
    * - 1
      - 3
      - dose:second_drug
      - drug two in third compartment
      
    * - 1
      - 1
      - dose:first_drug
      - drug one in first compartment
      
Note that the |popy| format above leaves the destination compartment of each drug to be determined in the script file (the |derivatives| section). This is because the depot compartment for each drug is primarily a modelling decision, which might be changed in later analyses.
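
The conversion in :numref:`table_nonmem_cmt` can be sketched as a lookup from 'CMT' to a dose name. This is an illustration only: ``CMT_TO_DOSE_NAME`` and ``dose_type`` are hypothetical, the dose names are the ones used in the table, and only plain dosing rows (EVID=1) are handled.

```python
# Dose names taken from the example table; choose names that suit your model.
CMT_TO_DOSE_NAME = {1: "first_drug", 3: "second_drug"}


def dose_type(evid, cmt):
    """Return a named dose TYPE for an EVID=1 row (hypothetical helper)."""
    if evid != 1:
        raise ValueError(f"Only EVID=1 rows are handled here, got {evid!r}")
    if cmt not in CMT_TO_DOSE_NAME:
        raise ValueError(f"No dose name defined for CMT={cmt!r}")
    return "dose:" + CMT_TO_DOSE_NAME[cmt]


# (EVID, CMT) pairs from the example table.
nonmem_rows = [(1, 1), (1, 1), (1, 3), (1, 1)]
print([dose_type(evid, cmt) for evid, cmt in nonmem_rows])
# prints: ['dose:first_drug', 'dose:first_drug', 'dose:second_drug', 'dose:first_drug']
```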
      
.. _nm_AMT:
    
AMT
======

The |nonmem| AMT field specifies the amount of each dose. The same field can be used in |popy|, so usually the AMT field needs no conversion.

|nonmem| only allows a single AMT field to be used. If you have multiple doses, |eg| for two different drugs, then |nonmem| forces you to put all dose amount values in a single column in your data file, even if the amounts are in different units.

In your |popy| data file you might want to take the opportunity to use separate columns, |eg| 'AMT_DRUG1', 'AMT_DRUG2' as a way of making your |data_file| and |script_file| clearer.


.. _DV:
    
DV
========
 
The |nonmem| DV field specifies the observed measurements in a data file, for example plasma drug concentration in a |pk| study or biomarker data in a |pd| trial. The same field can be used in |popy|, so often the DV field requires no conversion.

However, |nonmem| only allows a single DV field to be used. This has the same issues as the :ref:`AMT<nm_amt>` field above. If you have multiple types of measurement in a study then |nonmem| forces you to place all measurement values in a single column, even if the values have different units.

In your |popy| |data_file| you might like to split the DV data into separate columns, |eg| 'CONC', and 'MARKER'. This will make your |data_file| and |script_file| easier to read.

An example data conversion with two types of measurement is shown in :numref:`table_nonmem_two_measurements`.
      
.. _table_nonmem_two_measurements:

.. list-table:: Nonmem DV to |popy| named fields 
    :header-rows: 1

    * - |nonmem| DV
      - CONC
      - CONC_FLAG
      - MARKER
      - MARKER_FLAG
      - Comment
      
    * - 5.3
      - 5.3
      - 1
      - 0
      - 0
      - conc obs
      
    * - 3
      - 0
      - 0
      - 3
      - 1
      - marker obs
      
    * - 5
      - 0
      - 0
      - 5
      - 1
      - marker obs
      
    * - 12.1
      - 12.1
      - 1
      - 0
      - 0
      - conc obs
 
Note in :numref:`table_nonmem_two_measurements` it is necessary to use the '_FLAG' field convention. The '_FLAG' field is similar to the |nonmem| :ref:`MDV` field, but you can have multiple '_FLAG' fields. In a flag field '1' means use this observation, '0' means ignore. The flag field means you don't have to use 'if statements' in the |predictions| section.
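
The split in :numref:`table_nonmem_two_measurements` can be sketched as follows. The example assumes that you already know which type of measurement each DV value represents, here passed as a hypothetical ``kind`` label; how that is recorded in your |nonmem| data set is not covered by this sketch. ``split_dv`` is a hypothetical helper, not a |popy| function.

```python
def split_dv(dv, kind, fields=("CONC", "MARKER")):
    """Spread one DV value over named columns with '_FLAG' fields.

    'kind' names the column this DV value belongs to (an assumption of
    this sketch). All other columns get value 0 and flag 0.
    """
    if kind not in fields:
        raise ValueError(f"Unknown measurement type: {kind!r}")
    row = {}
    for name in fields:
        used = name == kind
        row[name] = dv if used else 0
        row[name + "_FLAG"] = 1 if used else 0
    return row


# (DV, kind) pairs matching the example table.
dv_rows = [(5.3, "CONC"), (3, "MARKER"), (5, "MARKER"), (12.1, "CONC")]
for dv, kind in dv_rows:
    print(split_dv(dv, kind))
```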

.. _MDV:
    
MDV
=========
 
The |nonmem| MDV (missing dependent variable) column is used to ignore some observations. It is similar in function to the |popy| flag field syntax described in :ref:`DV`.

However, the MDV indicator contains a double negative: an observation is valid in |nonmem| if MDV=0, |ie| it is |not| **missing**. The |popy| flag field is just yes/no, |ie| an observation X is valid if X_FLAG=1.

An example DV/MDV conversion is shown in :numref:`table_nonmem_mdv`.
      
.. _table_nonmem_mdv:

.. list-table:: Nonmem DV/MDV to |popy| flag fields 
    :header-rows: 1

    * - |nonmem| DV
      - |nonmem| MDV
      - CONC
      - CONC_FLAG
      - Comment
      
    * - 5.3
      - 0
      - 5.3
      - 1
      - valid obs
   
    * - na
      - 1
      - 0.0
      - 0
      - invalid obs
      
    * - 0.0
      - 1
      - 0.0
      - 0
      - invalid obs
   
    * - 2.9
      - 0
      - 2.9
      - 1
      - valid obs
      
Here:

.. math::

    FLAG = 1 - MDV
      
Also, in |nonmem| you are only allowed to have one MDV field, which makes it less useful when you have multiple types of measurement.
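
The DV/MDV conversion in :numref:`table_nonmem_mdv` can be sketched as below. This is an illustration only; ``mdv_to_flag`` is a hypothetical helper, and writing 0.0 for ignored observations simply follows the convention used in the table.

```python
def mdv_to_flag(dv, mdv):
    """Return (value, flag) for one DV/MDV pair, using FLAG = 1 - MDV."""
    mdv = int(mdv)
    if mdv not in (0, 1):
        raise ValueError(f"MDV must be 0 or 1, got {mdv!r}")
    flag = 1 - mdv
    # Ignored observations may hold non-numeric text such as 'na',
    # so only convert DV to a number when the observation is used.
    value = float(dv) if flag == 1 else 0.0
    return value, flag


# (DV, MDV) pairs from the example table.
pairs = [(5.3, 0), ("na", 1), (0.0, 1), (2.9, 0)]
print([mdv_to_flag(dv, mdv) for dv, mdv in pairs])
# prints: [(5.3, 1), (0.0, 0), (0.0, 0), (2.9, 1)]
```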
 

