Country analysis of PubPeer annotated articles

Just out of curiosity: after Sci-Hub, here is an analysis of papers commented on at the PubPeer website. PubPeer is now also screened on a regular basis by Holden Thorp, the editor-in-chief of Science…

Unfortunately I am losing many records because of incomplete or malformed addresses, but some preliminary conclusions can already be drawn from my world map.
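One way to lose fewer records, sketched here under assumptions: instead of keeping only the last word of the final address field, compare the whole field against a list of known country names, so that multi-word names such as New Zealand survive. The helper `extract_country()` and the vector `known_countries` are hypothetical and not part of pubpeer.R, and the example addresses are invented.

```r
# Hypothetical helper, not part of pubpeer.R: match the last address field
# against known country names instead of keeping only its last word.
known_countries <- c("USA", "China", "New Zealand", "South Africa", "Saudi Arabia")

extract_country <- function(address, countries = known_countries) {
	if (is.na(address)) return(NA_character_)
	# Last comma-separated field, trimmed, without trailing dots
	fields <- strsplit(address, ",", fixed = TRUE)[[1]]
	last_field <- trimws(tail(fields, 1))
	last_field <- sub("[.]+$", "", last_field)
	# Country names that the field ends with (case-insensitive)
	hits <- countries[endsWith(tolower(last_field), tolower(countries))]
	if (length(hits) == 0) return(NA_character_)
	# If several names match, prefer the longest one
	hits[which.max(nchar(hits))]
}

# Invented example addresses
addresses <- c(
	"Dept. of Biology, Some University, Auckland 1010, New Zealand.",
	"Institute of Chemistry, Cape Town, South Africa",
	"Unknown Lab, Department of Engineering"
)
vapply(addresses, extract_country, character(1), USE.NAMES = FALSE)
```

In practice the list of names could be taken from the map itself, for example `unique(map_data('world')$region)`, so that every extracted name is guaranteed to join.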

pubpeer.R: grey indicates no data, black only a few entries, red numerous entries.

A further revision will need to include more addresses and also overall research output as a reference.
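As a rough sketch of what such a reference could look like: divide the PubPeer count per country by that country's total number of papers. Both tables below, their column names (`papers`, `per_10k`) and all numbers are placeholders made up for illustration; they are not real data.

```r
library(dplyr)

# Placeholder inputs: names and numbers are invented, not real counts.
pubpeer_counts <- tibble::tibble(
	region = c("Country A", "Country B", "Country C"),
	n      = c(120, 90, 30)          # PubPeer-commented papers
)
research_output <- tibble::tibble(
	region = c("Country A", "Country B", "Country C"),
	papers = c(60000, 30000, 20000)  # overall research output
)

# Commented papers per 10,000 published papers, highest rate first
pubpeer_rate <- pubpeer_counts %>%
	left_join(research_output, by = "region") %>%
	mutate(per_10k = 1e4 * n / papers) %>%
	arrange(desc(per_10k))
```

Mapping `per_10k` instead of `n` as the fill would then show relative rather than absolute numbers.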

Other country-level data are also interesting, to name just a few:

1/

Trust scores by continent https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10393470/pdf/rmmj-14-3-e0015.pdf

2/

image duplication by country https://link.springer.com/article/10.1007/s11948-018-0023-7/figures/3

3/

table 1 abbreviated https://link.springer.com/article/10.1007/s11948-017-9939-6

4/

retractions 2020 https://www.biorxiv.org/content/10.1101/2020.04.29.063016v6

5/

Retractions https://jamanetwork.com/journals/intemed/articlepdf/2779425/jamainternal_gaudino_2021_ld_210018_1627675018.88326.pdf
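
# Packages needed by the pipeline below (map_data() also requires the maps package)
library(dplyr)    # %>%, mutate, case_when, count, right_join, as_tibble
library(stringr)  # word()
library(ggplot2)  # map_data(), ggplot(), geom_polygon()

# Assumes `pubpeer` is a data frame with a character column `affiliation_list`
pubpeer[,"affiliation_list"] %>%
	# keep the last comma-separated field of each address
	(function(x) {
		strsplit(x,",") %>% sapply(., tail, 1) %>% unlist()
	}) %>%
	# keep only the last word of that field
	word(.,-1) %>%
	as_tibble() %>%
	# remove dots and normalise "PR China" variants
	mutate(value = gsub("[\\.]+", "", value)) %>%
	mutate(value = gsub("PR.*", "China", value)) %>%
	# map last words to map_data('world') region names; NA drops the record
	mutate(value = case_when(
		value == "" ~ NA,
		value == "States" ~ "USA",
		value == "University" ~ NA,
		value == "Republic" ~ NA,
		value == "ROC" ~ NA,
		value == "and" ~ NA,
		value == "Maryland" ~ "USA",
		value == "(mainland)" ~ "China",
		value == "Kong" ~ "HongKong",
		value == "Chemistry" ~ NA,
		value == "Engineering" ~ NA,
		value == "PAK" ~ "Pakistan",
		value == "Arabia" ~ "Saudi Arabia",
		value == "Sciences" ~ NA,
		value == "Technology" ~ NA,
		value == "Medicine" ~ NA,
		value == "NY" ~ "USA",
		value == "America" ~ "USA",
		value == "York" ~ "USA",
		value == "Massachusetts" ~ "USA",
		value == "Hospital" ~ NA,
		value == "Zealand" ~ "New Zealand",
		value == "Pennsylvania" ~ "USA",
		value == "Africa" ~ "South Africa",
		.default = value)) %>%
	# count records per country
	count(value) %>%
	arrange(desc(n)) %>%
	rename(region = value) %>%
	# keep every map polygon; countries without records get n = NA
	right_join(map_data('world'), by = "region") %>%
	filter(region != "Antarctica") %>%
	ggplot() +
	geom_polygon(aes(long, lat, group = group, fill = n)) +
	coord_quickmap() +
	# log colour scale: black = few entries, red = many, light grey = no data
	scale_fill_gradient(name = "N", trans = "log10", breaks = c(10,100,1000),
		low = "black", high = "red", na.value = "lightgrey") +
	theme_void()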