# Change log

## 2.0.2

-   Fix a bug with `read_naaccr_xml_plain` when a `record_format` is provided

## 2.0.0

-   New functions
    -   `read_naaccr_xml_plain` and `read_naaccr_xml` for reading files in
        NAACCR's XML formats
-   New data
    -   `naaccr_formats`, a named list containing `record_format` objects for
        the official NAACCR formats
    -   `naaccr_format_21`, `naaccr_format_22`, and `naaccr_format_23` for the
        NAACCR XML formats, versions 21, 22, and 23
-   Possible breaking changes
    -   New columns in `record_format`
        -   `"parent"`: the field's parent XML node
        -   `"width"`: the field's length in characters
        -   `"cleaner"`: function used to clean/standardize data when reading
        -   `"unknown_finder"`: function to replace values with `NA` when reading
    -   Deprecated columns in `record_format`
        -   `"end_col"`: not used for XML files, and inferred from `"start_col"`
            and `"width"` for fixed-width files
    -   Package data now compressed using the `"xz"` method, which requires
        R version 2.10 or higher
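
For fixed-width files, the relationship between the deprecated `"end_col"` and
the `"start_col"` and `"width"` columns can be sketched in base R as follows.
This is only an illustration, assuming 1-based, inclusive column positions; the
`fields` data frame and its field names are hypothetical, and the package's own
internal computation may differ.

```r
# Hypothetical field layout; names and positions are illustrative only.
fields <- data.frame(
  name      = c("field_a", "field_b", "field_c"),
  start_col = c(1L, 2L, 10L),
  width     = c(1L, 8L, 3L)
)

# Assumption: a field that starts at `start_col` and spans `width` characters
# ends at the column start_col + width - 1 (1-based, inclusive).
fields$end_col <- fields$start_col + fields$width - 1L

print(fields)
```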

## 1.0.0

-   Added `field_levels` and `field_levels_all` lists, which make it easy to see
    the possible values for factor fields.
-   All factor and sentinel levels are now easy to type on a standard U.S.
    QWERTY keyboard.
-   `naaccr_factor` and `naaccr_sentinel` are now idempotent: applying them to
    already-processed values (e.g., `naaccr_factor(naaccr_factor(...))`) has no
    further effect.
-   Removed AJCC-copyright material.
-   Functions are safer: more argument checking, warnings, and option-robustness.
-   Fixed bugs with handling custom formats and columns not in the format.
-   Fixed bug with formats that included fields not found in text files.

## 0.5.0

-   Processed date and time fields now store their original text values in the
    `"original"` attribute, because partial dates and times are still useful.
-   New functions
    -   `unknown_to_na`: Convert levels of NAACCR factor to `NA` if they mean
        the value is unknown.
    -   `naaccr_encode`: Convert processed values back to their NAACCR text
        codes.
    -   `as.record_format`: Create a custom record format from an object
        (this was incorrectly an internal function before).
    -   `naaccr_date`, `naaccr_datetime`: Parse NAACCR-formatted dates and times.
    -   `read_naaccr_plain`: Read a NAACCR file and divide it into fields, but
        don't do any more processing.
    -   `split_sequence_number`: Divide the different pieces of information in a
        tumor sequence number into multiple columns (i.e., order, uniqueness,
        and invasiveness).
-   Revamped `read_naaccr` and `write_naaccr`
    -   *Much* faster!
    -   `read_naaccr` has more parameters to handle how a file is read:
        `skip`, `nrows`, and `encoding` like in `read.table`, and `buffersize`
        to ease the pain of reading large files.
-   `naaccr_record` and its kin now have a `format` parameter to use custom
    formats.
-   Behavior fields are now treated as coded factors, as they should have been.
-   **Potentially breaking changes**
    -   Cleaning functions (e.g., `clean_count`) no longer use a field name to
        retrieve cleaning parameters (e.g., field width).
        Those parameters now need to be passed directly.
    -   The `type` and `alignment` columns of `record_format` objects are now
        factors.
    -   The standard NAACCR record format objects (`naaccr_format_12`, ...)
        are no longer lazy-loaded. They are now immediately loaded.
-   Improved documentation throughout.

## 0.4.0

-   Major improvements to handling of field-specific codes

## 0.3.0

### New features

-   New function: `write_naaccr` for writing `naaccr_record` data sets to text
    files according to one of the NAACCR formats. For now, it writes all unknown
    and missing values as blanks, not the standard codes. Still, `write_naaccr`
    and `read_naaccr` are inverse functions.
-   Fields with country codes are now converted to factors.

### Backend

-   Using package `ISOcodes` instead of `maps` for location code tables
