---
title: "Creating custom import functions"
author: "Philippe Massicotte"
bibliography: biblio.bib
date: "`r Sys.Date()`"
format:
  html:
    minimal: true
    toc: true
    html-math-method: mathjax
vignette: >
  %\VignetteIndexEntry{Creating custom import functions}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
knitr:
  opts_chunk:
    collapse: true
    comment: "#>"
---

When `eemR` was originally created, I wrote a few functions to import EEMs from the spectrofluorometers I knew. Given the wide diversity of file formats, `eemR` now lets users write their own import functions.

## Introduction

In this example, we will learn how to create an import function for a specific EEM file generated by the software of a Cary Eclipse spectrofluorometer. First, let's have a look at the content of `custom_cary.csv`.

## Example: Importing Cary Eclipse Spectrofluorometer Data

```{r}
#| comment: ""
file <- system.file("extdata/custom_cary.csv", package = "eemR")

cat(readLines(file, n = 25L), sep = "\n")
```

From this output, we see that:

1. The excitation wavelengths are contained in the first column (250, 252, ...).
2. The first row contains offsets (50, 55, ...) that are added to each excitation wavelength to obtain the emission wavelengths. For example:
   - The emissions at excitation 250 nm are 250 + 50, 250 + 55, 250 + 60, ...
   - The emissions at excitation 252 nm are 252 + 50, 252 + 55, 252 + 60, ...
3. The fluorescence data start at row 3 and column 2.
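To make the offset arithmetic concrete, here is a small sketch using made-up values (three excitation wavelengths and three offsets, typed by hand rather than read from the file). The names `ex_demo`, `offset_demo` and `em_demo` are hypothetical. `outer()` adds every offset to every excitation wavelength:

```{r}
# Hypothetical values for illustration only (not read from custom_cary.csv)
ex_demo <- c(250, 252, 254)
offset_demo <- c(50, 55, 60)

# One row per excitation wavelength, one column per offset
em_demo <- outer(ex_demo, offset_demo, "+")
dimnames(em_demo) <- list(
  paste0("ex_", ex_demo),
  paste0("offset_", offset_demo)
)

em_demo
```

Each excitation wavelength gets its own set of emission wavelengths, so the measurements do not share a common emission axis.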

### Function Requirements

The first thing we need to do is create a function that reads these data into a format that can be used by `eemR`. **The function needs to meet the following criteria:**

1. Have an argument `file` that contains the path of the file(s) to read.
2. Return a list containing the following elements:
   - `file`: the path of the file.
   - `em`: a numeric vector containing emission wavelengths.
   - `ex`: a numeric vector containing excitation wavelengths.
   - `x`: a matrix of `length(em)` rows by `length(ex)` columns.

Furthermore, the `x` matrix of the list must be arranged as follows:

![](matrix.png)
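As a rough illustration of the expected structure, the sketch below builds a toy list with made-up wavelengths and fluorescence values. Every name ending in `_toy`, as well as the file name `toy.csv`, is hypothetical and only serves to show the dimensions of each element.

```{r}
# Made-up wavelengths, for illustration only
em_toy <- c(300, 305, 310, 315) # 4 emission wavelengths
ex_toy <- c(250, 260, 270)      # 3 excitation wavelengths

# length(em) rows by length(ex) columns, filled with placeholder values
x_toy <- matrix(
  seq_len(length(em_toy) * length(ex_toy)),
  nrow = length(em_toy),
  ncol = length(ex_toy)
)

eem_toy <- list(file = "toy.csv", x = x_toy, ex = ex_toy, em = em_toy)

dim(eem_toy$x)

# Placeholder value stored at emission 310 nm and excitation 260 nm
eem_toy$x[em_toy == 310, ex_toy == 260]
```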

Let's now write a function that will read the eem file and format the data accordingly. In this particular example, fluorescence has been measured in asynchronous mode. Hence, extra manipulations are needed to get the data on a regular grid.

### Writing the Import Function

```{r}
#| message: false

library(dplyr)
library(tidyr)
library(eemR)

import_cary <- function(file) {
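  # Skip the first line of the file
  dat <- read.csv(file, nrows = 102L, skip = 1L)

  # Excitation wavelengths are in the first column
  ex <- na.omit(dat[, 1L])
  # Offsets added to each excitation wavelength to get the emission wavelengths
  em <- seq(50L, 330L, by = 5L)

  # Emission wavelength of every (excitation, offset) pair
  em <- outer(ex, em, "+")
  em <- as.vector(em)
  ex <- rep(ex, 57L) # 57 offsets per excitation wavelength

  # Keep only the fluorescence values (drop the first column and first row)
  x <- dat[, -1L]
  x <- x[-1L, ]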

  x <- matrix(
    as.numeric(unlist(x, use.names = FALSE)),
    ncol = 101L,
    byrow = FALSE
  )

  res <- tibble(ex, em, x = as.vector(x)) |>
    arrange(ex, em) |>
    complete(ex, em, fill = list(x = NA))

  ex <- sort(unique(ex))
  em <- sort(unique(em))
  x <- matrix(res$x, ncol = length(ex), byrow = TRUE)

  # We need to interpolate because the data are not on a regular grid (i.e.
  # asynchronous)
  r <- MBA::mba.surf(
    res |> drop_na(),
    no.X = 200L,
    no.Y = 200L,
    extend = FALSE
  )

  l <- list(
    file = file,
    x = t(r$xyz.est$z),
    ex = r$xyz.est$x,
    em = r$xyz.est$y
  )

  l
}
```

We can now try our function and look at the structure of the returned list.

### Testing the Import Function

```{r}
str(import_cary(file))
```
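Before handing a custom function to `eemR`, it can help to check its output against the criteria listed earlier. The helper below is a hypothetical sketch: `check_import_output()` is not part of `eemR`, and it only tests the structural requirements described in this vignette. It is demonstrated on a made-up list (`toy`) so that the chunk does not depend on any file.

```{r}
# Hypothetical helper, not part of eemR
check_import_output <- function(res) {
  # All four elements must be present
  required <- c("file", "x", "ex", "em")
  missing_elements <- setdiff(required, names(res))
  if (length(missing_elements) > 0L) {
    stop("Missing element(s): ", paste(missing_elements, collapse = ", "))
  }

  # Wavelengths must be numeric and x must be length(em) by length(ex)
  stopifnot(
    is.numeric(res$em),
    is.numeric(res$ex),
    is.matrix(res$x),
    nrow(res$x) == length(res$em),
    ncol(res$x) == length(res$ex)
  )

  invisible(TRUE)
}

# Made-up list with 3 emission and 2 excitation wavelengths
toy <- list(
  file = "toy.csv",
  x = matrix(0, nrow = 3L, ncol = 2L),
  ex = c(250, 260),
  em = c(300, 305, 310)
)

check_import_output(toy)
```

The same check can be applied to the output of `import_cary(file)`.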

We will use the `import_function` argument of the `eem_read()` function to tell `eemR` how to read our file.

```{r}
eem <- eem_read(file, import_function = import_cary)

eem
```

We can visualize the EEM using the `plot()` function.

### Visualizing the EEM Data

```{r}
#| fig.width: 8
#| fig.height: 6
plot(eem)
```

### Using Other Functions in `eemR`

```{r}
#| fig.width: 8
#| fig.height: 6

# Remove second order Rayleigh scattering
plot(eem_remove_scattering(eem, "rayleigh", order = 2L, width = 15L))

# Extract Coble's peaks
eem_coble_peaks(eem)
```
