Tag Archives: R

R Groundhog

12.07.2023 admin

Reproducible science needs controlled environments.

Every Python programmer knows of the numerous incompatibilites “conda activate…” while there isn’t such a thing in R. Well until now… or at least as I learned it today…

Groundhog should be a R core function.

Aug 5, 2023

Thank you for pointing me to renv 1.0.0

We’re thrilled to announce the release of renv 1.0.0. renv has been around since 2019 as the successor to packrat, but this is the first time (!!) we’re blogging about it.

Software

DPI and PPI in scientific publications

6.02.2023 admin

It seems that also graphic design tasks are more and more shifted to the authors. A journal asked me to revise my R “figures with >6 px size fonts at 600 dpi” which does not make so much sense to me as it mixes apples and oranges (DPI and PPI) and does even give me any size estimate of the final image.

Fortunately, there is a great website that has a lot of helpful explanations and demo code.

In summary PPI are number of pixels per inch. Pixel is the smallest unit that can be displayed on a screen, the picture x element. So my screen has about 227 ppi which is super sharp while for historical reasons not the physical but a logical PPI of 96 is being assumed for screens. Continue reading DPI and PPI in scientific publications →

Software

A R function creating 96 well plate sample assignment

22.11.2022 admin

Most recently I had to add forgotten data to a SQL server database that was in use only until 2010. Script language was Cold Fusion at that time creating a visual interface to 96 well plates where proband IDs are assigned to plate positions by drop down fields.

R doesn’t have any good way for formatted data entry. Some shiny apps would be helpful but a bit overkill here. Data export into a javascript framework would be also a more professional solution while I just needed only a quick way to modify my database.

So I came up with some R code producing html code embedding javascript functions that can export the result as SQL code, so a 4 language mixup.

dataentry <- function(plate_id,city) {
# get orig IDs from database first

# position identifier A01, A02...
pos = NULL
for (i in 1:8) {
  pos = c(pos, paste0( rep( LETTERS[i], 12), str_pad( 1:12 ,2,pad="0" ) ) )
}
	
HTML <- paste('<html>
<script>
function download(filename, text) {
  var element = document.createElement("a");
  element.setAttribute("href", "data:text/plain;charset=utf-8," + encodeURIComponent(text));
  element.setAttribute("download", filename);
  element.style.display = "none";
  document.body.appendChild(element);
  element.click();
  document.body.removeChild(element);
}
function saveValues() {
  var frm = document.getElementById("f");
  var params = "";
  var sample_id = ',sample_id,';
  for( var i=0; i<frm.length; i++ ) {
    sample_id++;
    var fieldId = frm.elements[i].id;
    var fieldValue = frm.elements[i].value;
    if (fieldValue) params += "INSERT INTO samples (plate_id,plate_position,patient_id) VALUES (',plate_id,'," + fieldId + "," + fieldValue + ")\\n";
  }
  download("',plate_id,'.SQL",params);
}
</script>
<body>
<form id="f">')
	
# dropdowns
sel ='<option value=""></option>'
for (i in 1:dim(patients)[1]) {
  sel <- paste0(sel,'<option value=',patients[i,"patient_id"],'>',patients[i,"Orig_ID"],'</option>\n')
}
for (i in pos) {
  if (substr(i,2,3) == "01") HTML <- paste0(HTML,'<BR>')
  HTML <- paste(HTML,'<select id=',i,'>',sel,'</select>')
}
	
HTML <- paste(HTML,'</form>
<br><input name="save" type="button" value="SQL" onclick="saveValues(); return false"/>
</body></html>')
	
sink( paste('plate.html') )
cat(HTML)
sink()
}
dataentry(725,'city')

Software

How to scrape a website with R II: WYSIWYG

9.02.2021 admin

Part II

Although Rselenium allows a screenshot of the current browser window

library(RSelenium)
remDr <- remoteDriver(
  remoteServerAddr = "localhost",
  port = 4444,
  browserName = "Chrome"
)
remDr$open()
remDr$navigate("http://google.com")
remDr$screenshot(display = T) # remDr$screenshot(file="screen.jpg")

I found it extremely difficult to control a webbrowser running in a Docker container – looking up the DOM tree, injecting javascript etc is a lot of guess work.

So we need also a VNC server in the docker container as found at github.

After starting in the terminal

docker run -d -p 4444:4444 -p 5900:5900 -v /dev/shm:/dev/shm selenium/standalone-chrome:4.0.0-beta-1-prerelease-20210207

we can watch live at vnc://127.0.0.1:5900 what’s going on.

Software

Breaking up long chains in R’s dplyr/magrittr code and calling sub functions

26.01.2021 admin

Having several long and redundant chains in R ‘s magittr code, I have now figured out how to pipe into named and unnamed functions

f1 <- function(x) {
  x %>% count() %>% print()
}
f2 <- function(x) {
  x %>% tibble() %>% print()
}

So we have now two named functions with code blocks that can be inserted in an unnamed function whenever needed

iris %>%
  # do any select, mutate ....
  # before running both functions
  (function(x) {
    x %>% select(Species) %>% f1() %>% print()
    x %>% f2() %>% print()
  })

We can make a variable global from inside a function using <<- assignment but I have not found a way to continue the dplyr chain with any returned variable. So we are trapped inside the function

iris %>%
  (function(x) {
    result <<- x %>% head(1)
  })

Software

How to scrape a website with R I: Using a browser generated cookie

19.11.2020 admin

While there are quite some SO examples out there how to manage the login, here are the ncessary steps whenever you need to login in manually and have to start with a browser cookie. First install the “EditThisCookie” plugin in Chrome and export the cookie Continue reading How to scrape a website with R I: Using a browser generated cookie →

Software

Edit R plots in Keynote

4.07.2020 admin

Did not know but there is a library that is able to export directly to .pptx which can be imported into Keynote and edited after ungrouping.

Software

Graphical display of outbreaks: Transmission trees

30.06.2020 admin

Let’s start with some examples from the literature, find out necessary elements, compare different versions and develop a R template for general use.

“Exploding Stars”

CDC Field Epidemiology Handbook, 2019, ISBN 9780190933692 p115

“Tetris”

https://doi.org/10.1016/S0140-6736(20)30154-9 (2020)

“Clean Undeterministic”
“Clean Tree”

https://academic.oup.com/mbe/article/34/4/997/2919386

“Tree Addons”

https://openres.ersjournals.com/content/4/2/00162-2017

“Railways”

https://www.ages.at/service/service-presse/pressemeldungen/epidemiologische-abklaerung-am-beispiel-covid-19/

“Simple ”

https://www.ncbi.nlm.nih.gov/pubmed/32003551

“Vertical”

“star bus”

https://en.wikipedia.org/wiki/Index_case

“Mathematical”

https://www.biorxiv.org/content/10.1101/142570v3.full

“Genetics”

https://wwwnc.cdc.gov/eid/article/26/9/20-1798_article

Pages: 1 2

Software, Video

From R to Python

21.06.2020 admin

It’s bit confusing if you are having long-term experience with R but need some OpenCV Python code. What worked for me

download and install Python 3.8.3.
~~pip install opencv-python~~
pip install opencv-contrib-python
~~although Spyder or Jupyter is recommended for data science, I went for PyCharm~~
install Atom and follow the video instructions

take care, numerous non working introductions out there, stick to recent version

One World

A new animation of the famous HadCRUT4 climate dataset

29.10.2019 admin

Download

Here is the R sample code (PPT aspect ratio is 6:4, Youtube wants 16:9) .

As ggplot2 animation packages have major difficulties to manipulate the single frames, I am combining here raw PNGs using ffmpeg.

# read_cru_hemi() modified from https://mccartneytaylor.com/plotting-climate-change-on-a-spider-graph-using-r

list.of.packages <- c("ggplot2", "reshape", "stringr","RColorBrewer")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
lapply(list.of.packages, require, character.only = TRUE)

read_cru_hemi <- function(filename) {
  tab <- read.table(filename,fill=TRUE)
  nrows <- nrow(tab)
  hemi <- data.frame(
    year=tab[seq(1,nrows,2),1],
    annual=tab[seq(1,nrows,2),14],
    month=array(tab[seq(1,nrows,2),2:13]),
    cover=array(tab[seq(2,nrows,2),2:13])
  )
  hemi[,15:26][ hemi[,15:26]==0 ] <- c(NA)
  return(hemi)
}

url_dat <- "https://crudata.uea.ac.uk/cru/data/temperature/HadCRUT4-gl.dat"
tempdat <- read_cru_hemi(url_dat)
tempmelt <- melt(tempdat[,c(1,3:14)],id="year")

colfunc <- colorRampPalette(c("grey","grey","red"))
FadeToGrey <- colfunc(2019-1850)

new_theme <- theme_classic() + theme(
  text = element_text(size=18, colour="grey"),
  axis.line = element_blank(), 
  axis.text = element_text(colour="grey"),
  axis.ticks = element_line(colour="grey"),
  axis.title.x = element_blank(),
  panel.border = element_blank(),
  panel.grid.major = element_blank(),
  panel.grid.minor = element_blank(),
  panel.background = element_blank(),
  legend.position = "none"
)
theme_set(new_theme)

for(i in 1850:2019){
p <- ggplot(tempmelt[tempmelt$year %in% 1850:i,], aes(x=variable,y=value,color=as.factor(year),group=year)) + 
  geom_line() +
  scale_x_discrete( labels=month.abb) +
  scale_y_continuous( name="difference from baseline  [ oC ]", limits=c(-1,1) ) +
  annotate("text", x=11, y=1, label=i, size=7) +
  scale_color_manual( values=FadeToGrey[ 1:c(i-1849) ]  )
  fn <- paste("/Users/xxx/Desktop/X/",str_pad(i-1849, 3, pad = "0"),".png",sep="")
  ggsave(p, file=fn, width = 16, height = 9)
}

# not run
# ffmpeg -framerate 10 -i /Users/xxx/Desktop/X/%3d.png -r 5 -pix_fmt yuv420p -y /Users/xxx/Desktop/X/out.mp4

In comparison here is the original circular plot. Would require blue, green, yellow, red in the Color Ramp Palette…

Now it is only a minor step to the warming strips.

ggplot(tempdat, aes(x = year, y = 1, fill = annual))+
  geom_tile()+
  scale_y_continuous(expand = c(0, 0))+
  scale_x_continuous(expand = c(0, 0))+
  scale_fill_gradientn(colors = rev(col_strip)) +
  guides(fill = guide_colorbar(barwidth = 1)) +
  theme( axis.ticks= element_blank(),
         axis.text = element_blank(),
         axis.title = element_blank()
  )

tempmelt$variable <- as.numeric(str_replace(as.character(tempmelt$variable),"month.",""))
ggplot(tempmelt, aes(x = year, y = variable, z = value)) +
  geom_raster(aes(fill = value)) +
  scale_fill_gradientn(colors = rev(col_strip)) +
  scale_x_continuous(expand = c(0,0)) +
  scale_y_continuous(expand = c(0,0)) +
  theme( axis.ticks= element_blank(),
         axis.text = element_blank(),
         axis.title = element_blank()
  )

Software

Rstudio, knitr (Rmarkdown2) and bash

5.01.2015 admin

I couldn’t find any example online how to revise my R code getting the exif data from pictures

fn <- c("/usr/local/bin/exiftool /Users/wjst/Desktop/white.tif")
info <- system(fn,inter=TRUE,wait=TRUE)

when moving now to knitr. So here is what worked for me as a replacement including the parsing
of exiftool output.

```{r, engine='bash', echo=FALSE}
/usr/local/bin/exiftool /Users/wjst/Desktop/white.tif >/Users/wjst/Desktop/white.txt
```
```{r Exif, echo=FALSE}
fn <- '/Users/wjst/Desktop/white.txt'
info <- paste(readLines(fn))
info <- strsplit(info,"[:]{1}[ ]{1}")
info <- matrix(data=unlist(info), ncol = 2, byrow = TRUE)
info <- gsub("(^[[:space:]]+|[[:space:]]+$)", "", info)
```
*Exif*
`r kable(info)`

Joke, Software

Better figures

26.09.2014 admin

A recent paper identifies 10 rules for better pictures. As I have also given several lectures on that topic, I was excited what the authors think…
1. Know your audience. This is trivial as you never know your audience.
2. Identify your message. True and not true at the same time. True as it makes your findings more evident – not true if you are allowing a reader to find his own message.
3. Adapt the figure to the support medium. Trivial. May be very time consuming.
4. Captions are not optional. Absolutely true, I also suppport good captions – mini stories for those who can’t read the whole text.
5. Do not trust the defaults. Trivial. No one does.
6. Use color efficiently. Not really, avoid colors for those of us who are colorblind and to avoid expensive page charges.
7. Do not mislead the reader. Why should I?
8. Avoid Chartjunk. Absolutely. Most frequent problem.
9. Message trumps beauty. Sure, form follows function.
10. Get the right tool. Maybe correct while the further recommendations look like a poor man’s effort to make his first graphic at zero cost: Gimp, Imagemagick, R…