.. _p2ndat_n2pdat_data_conv:

Nonmem to |popy| Data conversions using P2NDAT and N2PDAT Scripts
####################################################################

See :ref:`nonmem_dat_to_popy_dat` for an overview of how the |nonmem| data format maps to the |popy| format. You can use this format information to write your own data conversion script in a general purpose programming language, |eg| |r| or |python|.

However, we provide a convenient :ref:`n2pdat_script` to automatically convert from |nonmem| to |popy| without doing any programming. We actually provide two scripts that are mirror images of each other, as follows:-

* :ref:`p2ndat_script` - converts from |popy| to |nonmem| data
* :ref:`n2pdat_script` - converts from |nonmem| to |popy| data
      
The two types of conversion scripts are illustrated in this section using the following files from the |popy| examples folder:-

.. code-block:: console

    c:\PoPy\examples\p2ndat_script.pyml
                     n2pdat_script.pyml
                     fit_example1_data.csv
    
Here 'fit_example1_data.csv' is in :ref:`input_data_format` and is the simple |pk| data file discussed in :ref:`simple_fit_example`. 'p2ndat_script.pyml' is a |popy| script that will convert the original |popy| 'fit_example1_data.csv' to |nonmem| format, see :ref:`p2ndat_example`. The 'n2pdat_script.pyml' will convert the newly created |nonmem| data file back to |popy| format, see :ref:`n2pdat_example`.

The data files in this section form a loop:-
    
.. parsed-literal::

    fit_example1_data.csv -:ref:`p2ndat<p2ndat_example>`-> fit_example1_nm_data.csv -:ref:`n2pdat<n2pdat_example>`-> fit_example1_data_v2.csv
    
Where 'fit_example1_data.csv' and 'fit_example1_data_v2.csv' are both compatible |popy| data files and 'fit_example1_nm_data.csv' is in |nonmem| format.
    
.. _p2ndat_example:
    
P2NDAT Example
=================

The first few rows of the original 'fit_example1_data.csv' file are shown in :numref:`table_fit_example1_popy_orig`.

.. _table_fit_example1_popy_orig:

.. csv-table:: Original data in |popy| format (first ten rows)
    :file: fit_example1_data_first_10_rows.csv

To view the example :ref:`p2ndat_script`, :ref:`open_a_popy_command_prompt` in this folder:-

.. code-block:: console

    c:\PoPy\examples\

And type:-
    
.. code-block:: console

    $ popy_edit p2ndat_script.pyml
    
Then run using:-

.. code-block:: console

    $ popy_run p2ndat_script.pyml
    
This will create a new |nonmem| data file 'fit_example1_nm_data.csv'. The first ten rows are shown in :numref:`table_fit_example1_nm_data`.

.. _table_fit_example1_nm_data:

.. csv-table:: Output data in |nonmem| format (first ten rows)
    :file: fit_example1_nm_data_first_10_rows.csv
    
The differences between the input |popy| data (:numref:`table_fit_example1_popy_orig`) and the output |nonmem| data (:numref:`table_fit_example1_nm_data`) are summarised in :numref:`table_popy_to_nonmem_comp`.

.. _table_popy_to_nonmem_comp:

.. list-table:: Comparing |popy| 'fit_example1_data.csv' and |nonmem| 'fit_example1_nm_data.csv'
    :header-rows: 1
    
    * - Input |popy| column
      - Output |nonmem| column
      - Comments
            
    * - :ref:`TYPE`
      - :ref:`EVID`
      - reset->3, dose->1, obs->0
      
    * - :ref:`ID`
      - :ref:`ID<nm_id>`
      - no change
      
    * - :ref:`TIME`
      - :ref:`TIME<nm_time>`
      - no change
      
    * - AMT
      - :ref:`AMT<nm_amt>`
      - dose rows no change, obs/reset rows -> 0
      
    * - DV_CENTRAL
      - :ref:`DV`
      - no change
      
    * - DV_CENTRAL_FLAG
      - :ref:`MDV`
      - 1-DV_CENTRAL_FLAG
    
Note the corresponding columns are not in the same order between 'fit_example1_data.csv' and 'fit_example1_nm_data.csv'. The :ref:`p2ndat_script` has removed the 'TYPE', 'DV_CENTRAL' and 'DV_CENTRAL_FLAG' |popy| fields, to leave 'TIME', 'ID' and 'AMT', then added the newly created |nonmem| specific 'DV', 'MDV', 'EVID' and 'CMT' columns.

The 'fit_example1_nm_data.csv' contains a 'CMT' field to specify that the |nonmem| dosing occurs in compartment 1. |popy| specifies the dosing compartment entirely in the script file, see :ref:`dosing_fields`, so the output 'CMT' column has no corresponding column in the |popy| data file. You have to specify the 'CMT' value in your :ref:`p2ndat_script` manually, see :ref:`p2ndat_output_nonmem_fields`.
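
The column mapping in :numref:`table_popy_to_nonmem_comp` can be sketched in |python| as follows. This is only an illustration of the rules in the table, not the :ref:`p2ndat_script` implementation. The function name ``popy_row_to_nonmem``, the ``cmt`` argument and the example row values are hypothetical:-

.. code-block:: python

    # Illustrative sketch only; names and example values are assumptions.
    EVID_FOR_TYPE = {"reset": 3, "dose": 1, "obs": 0}


    def popy_row_to_nonmem(row, cmt=1):
        """Map one PoPy data row (a dict) to a NONMEM-style row."""
        row_type = row["TYPE"]
        return {
            "TIME": row["TIME"],
            "ID": row["ID"],
            # Dose rows keep their amount; obs/reset rows are set to zero.
            "AMT": row["AMT"] if row_type == "dose" else 0.0,
            "DV": row["DV_CENTRAL"],
            "MDV": 1 - int(row["DV_CENTRAL_FLAG"]),
            "EVID": EVID_FOR_TYPE[row_type],
            # CMT has no PoPy data column, so the caller supplies it.
            "CMT": cmt,
        }


    # Made-up observation row, for illustration only.
    example = {"TYPE": "obs", "ID": 1, "TIME": 2.0, "AMT": 0.0,
               "DV_CENTRAL": 5.3, "DV_CENTRAL_FLAG": 1}
    print(popy_row_to_nonmem(example))

Passing the compartment number as an argument mirrors the fact that 'CMT' comes from the script file and not from the |popy| data.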

.. _p2ndat_script_syntax:
    
P2NDAT Script Syntax
----------------------

You can view the example :ref:`p2ndat_script` here:-

.. code-block:: console

    c:\PoPy\examples\p2ndat_script.pyml
    
Each section is discussed below.

.. _p2ndat_conv_method_options:

METHOD_OPTIONS
~~~~~~~~~~~~~~~~~~~~~~~

Just specifies the script type:-

.. code-block:: pyml

    METHOD_OPTIONS: {py_module: p2ndat}
   
See |method_options| for more info.
   
.. _p2ndat_conv_file_paths:

FILE_PATHS
~~~~~~~~~~~~~~~~~~~~~~~

Just specifies the input |popy| data file and output |nonmem| data file:-

.. code-block:: pyml

    FILE_PATHS:
        input_popy_file: fit_example1_data.csv
        output_nonmem_file: fit_example1_nm_data.csv

.. _p2ndat_input_popy_fields:
        
INPUT_POPY_FIELDS
~~~~~~~~~~~~~~~~~~~~~~~

Describes the columns of the input |popy| file:-

.. code-block:: pyml

    INPUT_POPY_FIELDS:
        time_field: TIME
        id_field: ID
        type_field: TYPE
        dv_fields: ['DV_CENTRAL']
        amt_fields: ['AMT']
        rate_fields: []
        dur_fields: []
        dose_labels: ['']
    
Here 'time_field', 'id_field' and 'type_field' are the |popy| data file :ref:`required_fields`. They default to the above values.

The 'dv_fields' is a list of |popy| :ref:`obs_fields` that will be moved into the |nonmem| |dv| field. Note you can specify multiple observed columns. Each observed field will result in extra rows in the |nonmem| data output, as |nonmem| only ever has **one** |dv| observation column.

The 'amt_fields' is a list of |popy| :ref:`dosing_fields`, |ie| columns that contain dose amounts. Similar to the 'dv_fields', if you specify multiple dosing amount columns, then the |nonmem| data output will contain extra rows, as |nonmem| only has one |nm_amt| field.

The 'rate_fields' and 'dur_fields' are blank because we only have bolus dosing here. If you have infusion dosing then add the :ref:`@inf_rate` and :ref:`@inf_dur` rate and duration parameters here.

The 'dose_labels' field contains the dosing names used in the |popy| data file. In this case dose_labels: [''] means that |popy| dose names are |not| used, |ie| the |type| column just uses 'dose' values. If you use 'dose:my_dose_name', 'dose:my_other_dose_name' in your |popy| data file, to describe :ref:`multi_dose_types`, then you need to list the names here, |eg| ['my_dose_name', 'my_other_dose_name'].
    
.. _p2ndat_output_nonmem_fields:
    
OUTPUT_NONMEM_FIELDS
~~~~~~~~~~~~~~~~~~~~~~~

Describes the columns of the output |nonmem| file:-

.. code-block:: pyml

    OUTPUT_NONMEM_FIELDS:
        comment_prefix: '#'
        column_names: auto
        time_field: TIME
        id_field: ID
        evid_field: EVID
        dv_field: DV
        mdv_field: MDV
        amt_field: AMT
        rate_field: none
        dur_field: none
        cmt_field: CMT
        obs_cmt_numbers: [1]
        dose_cmt_numbers: [1]
    
Here 'comment_prefix' allows loading of |nonmem| data files with comment lines. Lines starting with the 'comment_prefix' symbol are ignored. 
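
As a rough illustration of the comment handling described above, the following |python| sketch drops lines that start with the prefix before parsing the remaining text as '.csv' data. The function name ``read_rows`` and the sample text are hypothetical. Whether |popy| also tolerates leading whitespace before the prefix is not stated here, so this sketch assumes the prefix is the first character of the line:-

.. code-block:: python

    import csv
    import io


    def read_rows(text, comment_prefix="#"):
        """Parse CSV text, ignoring lines that start with comment_prefix."""
        kept = [line for line in text.splitlines()
                if not line.startswith(comment_prefix)]
        return list(csv.DictReader(io.StringIO("\n".join(kept))))


    # Made-up data with two comment lines, for illustration only.
    sample = "# exported for illustration\nID,TIME,DV\n1,0.0,0.0\n# note\n1,1.0,4.2\n"
    rows = read_rows(sample)
    print(len(rows), rows[0]["ID"])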

'column_names: auto' uses the column names in the '.csv' data file. You could rename them using a list here, a bit like the |nonmem| :ref:`$INPUT` section.

The 'time_field', 'id_field', 'evid_field', 'dv_field', 'mdv_field', 'amt_field', 'rate_field', 'dur_field' and 'cmt_field' entries allow you to specify the |nonmem| key fields 'TIME', 'ID', 'EVID', 'DV', 'MDV', 'AMT', 'RATE', 'DUR' and 'CMT'. These fields default to the |nonmem| key names.

If you do not require some of the |nonmem| fields, you can assign null values using 'none'. |eg| in this case 'rate_field' and 'dur_field' are set to 'none', because they only relate to infusion dosing and there is only bolus dosing in this example.

The 'obs_cmt_numbers' is a list of compartment indices to appear in the |cmt| column to be created by the :ref:`p2ndat_script`. The 'OUTPUT_NONMEM_FIELDS->obs_cmt_numbers' list must be the same length as the 'INPUT_POPY_FIELDS->dv_fields' list. The elements of both lists must correspond to the same type of observation. |eg| in this case all |popy| observations 'DV_CENTRAL' occur in |nonmem| compartment one. The :ref:`p2ndat_script` will copy the |popy| 'DV_CENTRAL' value into the |nonmem| |dv| column for all rows with |type| ='obs' and set |mdv| =0 for these rows.

The 'dose_cmt_numbers' is a list of compartment indices to appear in the |cmt| column to be created by the :ref:`p2ndat_script`. The 'OUTPUT_NONMEM_FIELDS->dose_cmt_numbers' list must be the same length as the 'INPUT_POPY_FIELDS->amt_fields' list. The elements of both lists must correspond to the same type of dose. |eg| in this case all |popy| dose amounts 'AMT' occur in |nonmem| compartment one. The :ref:`p2ndat_script` will copy the |popy| 'AMT' value into the |nonmem| |nm_amt| column for all rows with |type| ='dose' and set |nm_amt| =0.0 for other rows.

If you have multiple doses and multiple observation fields in your input |popy| data, then you have to specify the dv_fields/obs_cmt_numbers and amt_fields/dose_cmt_numbers list pairs carefully.
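
The length requirement on these list pairs can be checked before running a conversion. The following |python| sketch is a hypothetical helper, not part of |popy|. It assumes the two script sections have already been loaded into plain dictionaries, and it only checks the list lengths; it cannot tell whether the elements are in corresponding order:-

.. code-block:: python

    def check_list_pairs(popy_fields, nonmem_fields):
        """Return a list of problems with the paired field/compartment lists."""
        pairs = [("dv_fields", "obs_cmt_numbers"),
                 ("amt_fields", "dose_cmt_numbers")]
        problems = []
        for popy_key, nm_key in pairs:
            n_popy = len(popy_fields[popy_key])
            n_nm = len(nonmem_fields[nm_key])
            if n_popy != n_nm:
                problems.append(
                    f"{popy_key} has {n_popy} entries but {nm_key} has {n_nm}")
        return problems


    # Values taken from the example script sections above.
    popy_fields = {"dv_fields": ["DV_CENTRAL"], "amt_fields": ["AMT"]}
    nonmem_fields = {"obs_cmt_numbers": [1], "dose_cmt_numbers": [1]}
    print(check_list_pairs(popy_fields, nonmem_fields))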
    
    
.. _p2ndat_output_options:
    
OUTPUT_OPTIONS
~~~~~~~~~~~~~~~~~~~~~~~

Describes the output options. Currently, the only option is to remove fields from the final data file:-

.. code-block:: pyml
    
    OUTPUT_OPTIONS:
        drop_fields: ['TYPE', 'DV_CENTRAL', 'DV_CENTRAL_FLAG']
    
Here we are removing the old |popy| fields from the |nonmem| data output. This is useful in this case, as we wish to demonstrate regenerating the original |popy| fields when we use a :ref:`n2pdat_script` in the next section.
    
.. _n2pdat_example:
    
N2PDAT Example
=================
   
The :ref:`p2ndat_script` converts data from |popy| to |nonmem| format. Here we discuss the :ref:`n2pdat_script` that computes the inverse conversion from |nonmem| to |popy| format.
    
Assuming you have run the :ref:`p2ndat_example`, view the example :ref:`n2pdat_script` in your text editor by typing:-
    
.. code-block:: console

    $ popy_edit n2pdat_script.pyml
    
Then run the :ref:`n2pdat_script` using:-

.. code-block:: console

    $ popy_run n2pdat_script.pyml
    
This will create a new |popy| data file 'fit_example1_data_v2.csv'. The first ten rows are shown in :numref:`table_fit_example1_new_popy_data`.

.. _table_fit_example1_new_popy_data:

.. csv-table:: Output data in |popy| format (first ten rows)
    :file: fit_example1_data_v2_first_10_rows.csv

    
The differences between the input |nonmem| data (:numref:`table_fit_example1_nm_data`) and the output |popy| data (:numref:`table_fit_example1_new_popy_data`) are summarised in :numref:`table_nonmem_to_popy_comp`.

.. _table_nonmem_to_popy_comp:

.. list-table:: Comparing |nonmem| 'fit_example1_nm_data.csv' and |popy| 'fit_example1_data_v2.csv'
    :header-rows: 1
    
    * - Input |nonmem| column
      - Output |popy| column
      - Comments
            
    * - :ref:`TIME<nm_time>`
      - :ref:`TIME`
      - no change

    * - :ref:`ID<nm_id>`
      - :ref:`ID`
      - no change

    * - :ref:`AMT<nm_amt>`
      - AMT
      - no change
      
    * - :ref:`DV`
      - DV_CENTRAL
      - no change
      
    * - :ref:`MDV`
      - DV_CENTRAL_FLAG
      - 1-MDV
      
    * - :ref:`EVID`
      - :ref:`TYPE`
      - 3->reset, 1->dose:_bolus, 0->obs
      
    * - :ref:`CMT`
      - |na|
      - |popy| has no 'CMT' equivalent
    
The :ref:`n2pdat_script` has removed 'DV', 'MDV', 'EVID' and 'CMT' |nonmem| fields from 'fit_example1_nm_data.csv' and replaced them with 'TYPE', 'DV_CENTRAL' and 'DV_CENTRAL_FLAG' |popy| fields in 'fit_example1_data_v2.csv'.  

The 'fit_example1_data_v2.csv' contains **no** 'CMT' field because |popy| specifies the dosing compartment entirely in the script file, see :ref:`dosing_fields`.
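
The reverse mapping in :numref:`table_nonmem_to_popy_comp` can be sketched in |python| in the same spirit. This is an illustration of the table only, not the :ref:`n2pdat_script` implementation. The function name ``nonmem_row_to_popy`` and the example row values are hypothetical:-

.. code-block:: python

    # Illustrative sketch only; names and example values are assumptions.
    TYPE_FOR_EVID = {3: "reset", 1: "dose:_bolus", 0: "obs"}


    def nonmem_row_to_popy(row):
        """Map one NONMEM data row (a dict) to a PoPy-style row."""
        return {
            "TYPE": TYPE_FOR_EVID[int(row["EVID"])],
            "ID": row["ID"],
            "TIME": row["TIME"],
            "AMT": row["AMT"],
            "DV_CENTRAL": row["DV"],
            "DV_CENTRAL_FLAG": 1 - int(row["MDV"]),
            # CMT is deliberately not copied; PoPy has no equivalent column.
        }


    # Made-up dose row, for illustration only.
    example_nm = {"TIME": 0.0, "ID": 1, "AMT": 100.0, "DV": 0.0,
                  "MDV": 1, "EVID": 1, "CMT": 1}
    print(nonmem_row_to_popy(example_nm))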
    
.. _n2pdat_script_syntax:
    
N2PDAT Script Syntax
----------------------

You can view the example :ref:`n2pdat_script` here:-

.. code-block:: console

    c:\PoPy\examples\n2pdat_script.pyml
    
Each section is discussed below.

.. _n2pdat_conv_method_options:

METHOD_OPTIONS
~~~~~~~~~~~~~~~~~~~~~~~

Specifies the script type:-

.. code-block:: pyml

    METHOD_OPTIONS: {py_module: n2pdat}
   
See |method_options| for more information.
   
.. _n2pdat_conv_file_paths:

FILE_PATHS
~~~~~~~~~~~~~~~~~~~~~~~

Specifies the input |nonmem| data file and output |popy| data file:-

.. code-block:: pyml

    FILE_PATHS:
        input_nonmem_file: fit_example1_nm_data.csv
        output_popy_file: fit_example1_data_v2.csv

.. _n2pdat_input_nonmem_fields:
        
INPUT_NONMEM_FIELDS
~~~~~~~~~~~~~~~~~~~~~~~

Describes the columns of the input |nonmem| file:-

.. code-block:: pyml

    INPUT_NONMEM_FIELDS:
        comment_prefix: '#'
        column_names: auto
        date_field: none
        date_format: none
        time_field: TIME
        id_field: ID
        evid_field: EVID
        dv_field: DV
        mdv_field: MDV
        amt_field: AMT
        rate_field: none
        dur_field: none
        cmt_field: CMT
        obs_cmt_numbers: [1]
        dose_cmt_numbers: [1]
  
This has the same layout as the :ref:`p2ndat_output_nonmem_fields` section, with the addition of the 'date_field' and 'date_format' entries, which are both set to 'none' in this example. The other difference is that this section now describes an **input** |nonmem| data file instead of an **output** |nonmem| data file.
  
The 'obs_cmt_numbers' and 'dose_cmt_numbers' lists have to correspond to the 'dv_fields' and 'amt_fields' in the :ref:`n2pdat_output_popy_fields` section to get sensible |popy| data output. See below for more explanation.
  
.. _n2pdat_output_popy_fields:
    
OUTPUT_POPY_FIELDS
~~~~~~~~~~~~~~~~~~~~~~~

Describes the columns of the output |popy| file:-

.. code-block:: pyml

    OUTPUT_POPY_FIELDS:
        time_field: TIME
        id_field: ID
        type_field: TYPE
        dv_fields: ['DV_CENTRAL']
        amt_fields: ['AMT']
        rate_fields: []
        dur_fields: []
        dose_labels: ['']
    
This is the same as the :ref:`p2ndat_input_popy_fields` section. The only difference is that this section is now describing an **output** |popy| data file instead of an **input** |popy| data file.
    
..  comment
    this line is in  p2ndat_input_popy_fields
    The 'time_field', 'id_field' and 'type_field' are the |popy| data file :ref:`required_fields`. They default to the above values.
    
Here the 'dv_fields' is a list of |popy| observation columns to be created by the :ref:`n2pdat_script`, based on the input |nonmem| |dv| field. The 'OUTPUT_POPY_FIELDS->dv_fields' list must be the same length as the 'INPUT_NONMEM_FIELDS->obs_cmt_numbers' list. The elements of both lists must correspond to the same type of observation. |eg| in this case all |nonmem| observations occur in compartment one, so for |nonmem| data rows with |evid| =0 and |cmt| =1 the |nonmem| |dv| value is copied into the |popy| DV_CENTRAL column with DV_CENTRAL_FLAG=1.

The 'amt_fields' is a list of |popy| dose amount columns to be created by the :ref:`n2pdat_script`, based on the input |nonmem| |nm_amt| field. The 'OUTPUT_POPY_FIELDS->amt_fields' list must be the same length as the 'INPUT_NONMEM_FIELDS->dose_cmt_numbers' list. The elements of both lists must correspond to the same type of dose. |eg| in this case all |nonmem| doses occur in compartment one, so for |nonmem| data rows with |evid| =1 and |cmt| =1 the |nonmem| |nm_amt| value is copied into the |popy| |amt| column, with all other rows set to zero.

If you have multiple doses and multiple observation fields in your input |nonmem| data, then you have to specify the obs_cmt_numbers/dv_fields and dose_cmt_numbers/amt_fields list pairs carefully.
    
.. _n2pdat_output_options:
    
OUTPUT_OPTIONS
~~~~~~~~~~~~~~~~~~~~~~~

Describes the output options, currently, just which fields to remove:-

.. code-block:: pyml
    
    OUTPUT_OPTIONS:
        drop_fields: ['DV', 'MDV', 'EVID', 'CMT']

Here we are removing the |nonmem| specific fields. In a real life conversion it may be sensible to keep the |nonmem| fields, so that you can perform a side by side sanity check from within the |popy| output file. Note the fields above will be of little use to a |popy| :ref:`fit_script`, compared to the 'DV_CENTRAL', 'DV_CENTRAL_FLAG' and 'TYPE' fields, created by the :ref:`n2pdat_script`.

.. _regen_popy_comp:

Compare original |popy| data with P2NDAT/N2PDAT version
===========================================================

In this walkthrough we took a |popy| data file 'fit_example1_data.csv' and ran the :ref:`p2ndat_script` to create a |nonmem| data version. We then ran the :ref:`n2pdat_script` to regenerate the |popy| data from the |nonmem| data, as 'fit_example1_data_v2.csv'.

You can compare the first 10 rows of both the input |popy| data set in :numref:`table_fit_example1_popy_orig` and the output |popy| data in :numref:`table_fit_example1_new_popy_data`.

Both files contain the same column headers, |ie| 'TYPE', 'ID', 'TIME', 'AMT', 'DV_CENTRAL' and 'DV_CENTRAL_FLAG'. The values in each column are the same, with two exceptions: the 'AMT' column now has zero values in non-dose rows, and the 'dose' value in the |type| field is now 'dose:_bolus'. Both the input and output .csv files are valid |popy| data formats for the |pkpd| problem described in :ref:`simple_fit_example`.
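
If you want to check the round trip yourself, the comparison can be sketched in |python| as follows. This is a hypothetical helper, not part of |popy|. It treats 'dose' and 'dose:_bolus' as equivalent and only compares 'AMT' on dose rows, matching the two differences noted above. The example rows are made up; real rows could be loaded from the two '.csv' files with the standard ``csv.DictReader``:-

.. code-block:: python

    def as_number(value):
        """Return value as a float when possible, else as a stripped string."""
        try:
            return float(value)
        except (TypeError, ValueError):
            return str(value).strip()


    def normalise(row):
        """Reduce a PoPy row to the parts expected to survive the round trip."""
        row_type = "dose" if row["TYPE"].startswith("dose") else row["TYPE"]
        return (
            row_type,
            as_number(row["ID"]),
            as_number(row["TIME"]),
            # AMT is only compared on dose rows; other rows may differ.
            as_number(row["AMT"]) if row_type == "dose" else None,
            as_number(row["DV_CENTRAL"]),
            as_number(row["DV_CENTRAL_FLAG"]),
        )


    def rows_match(original_rows, regenerated_rows):
        """True if both row lists agree after normalisation."""
        return ([normalise(r) for r in original_rows]
                == [normalise(r) for r in regenerated_rows])


    # Made-up rows, for illustration only.
    original = [
        {"TYPE": "dose", "ID": "1", "TIME": "0", "AMT": "100",
         "DV_CENTRAL": "0", "DV_CENTRAL_FLAG": "0"},
        {"TYPE": "obs", "ID": "1", "TIME": "1", "AMT": "",
         "DV_CENTRAL": "4.2", "DV_CENTRAL_FLAG": "1"},
    ]
    regenerated = [
        {"TYPE": "dose:_bolus", "ID": "1", "TIME": "0.0", "AMT": "100.0",
         "DV_CENTRAL": "0.0", "DV_CENTRAL_FLAG": "0"},
        {"TYPE": "obs", "ID": "1", "TIME": "1.0", "AMT": "0.0",
         "DV_CENTRAL": "4.2", "DV_CENTRAL_FLAG": "1"},
    ]
    print(rows_match(original, regenerated))

Comparing parsed numbers instead of raw text avoids false mismatches such as '0' versus '0.0'.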

