Category Archives: Software

ImageTwin

I confess that I worked with the founder of ImageTwin some years ago, even encouraging him to found a company. I would even have been interested in further collaboration, but unfortunately the company has cut all ties.

Should we really now pay 25€ to test a single PDF?

price list 2023

My proposal in 2020 was to build an academic community around ImageTwin’s keypoint matching approach. AI analysis and an image depository would be nice additions, along with more comprehensive reports than just boxes drawn around duplicated image areas.

A research paper by new ImageTwin collaborators now finds

Duplicated images in research articles erode integrity and credibility of biomedical science. Forensic software is necessary to detect figures with inappropriately duplicated images. This analysis reveals a significant issue of inappropriate image duplication in our field.

Unfortunately, the authors of this paper ignore the integrity nomenclature, flagging only images that are expected to look similar.

Even worse, they also miss many duplications, as ImageTwin is notoriously bad with Western blots. Sadly, this paper erodes the credibility of forensic image analysis.

 

Oct 4, 2023

The story continues. Instead of working on a well-defined data set and determining the sensitivity, specificity, etc. of the ImageTwin approach, another preprint (bioRxiv, Scholar) shows that

Toxicology Reports published 715 papers containing relevant images, and 115 of these papers contained inappropriate duplications (16%). Screening papers with the use of ImageTwin.ai increased the number of inappropriate duplications detected, with 41 of the 115 being missed during the manual screen and subsequently detected with the aid of the software.

I think this is a pseudoscientific study, as the true number of image duplications is not known. We can no longer verify what ImageTwin does, as it is behind a paywall, which contradicts basic scientific principles. The accompanying news report by Anil Oza makes it even worse.

It is simply wrong that the software is “working at two to three times David’s speed”: it is 20 times faster but also gives numerous false positives. It is also wrong that “Patrick Starke is one of its developers” (Starke is a salesperson, not a developer). So in the end, the Oza news report is just a PR stunt, as confirmed on Twitter the next day
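To make the quoted figures concrete, here is a small sketch in Python that uses only the three numbers from the abstract (the variable names are mine). Every share it computes is relative to the combined manual-plus-software set, not to a known ground truth, so none of them is a sensitivity or specificity.

```python
# Figures taken from the quoted preprint abstract
papers_with_images = 715
papers_flagged = 115        # manual screen and ImageTwin combined
missed_by_manual = 41       # found only with the aid of the software

# Derived numbers: still relative to the flagged set, not to the truth
found_by_manual = papers_flagged - missed_by_manual
share_flagged = papers_flagged / papers_with_images
manual_share = found_by_manual / papers_flagged

print(f"flagged papers: {share_flagged:.1%} of all papers with images")
print(f"found by manual screen: {found_by_manual} "
      f"({manual_share:.1%} of all flagged papers)")
```

Without a reference set in which the real duplications are known, duplications missed by both the manual screen and the software simply do not show up in these numbers.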

https://twitter.com/ImageTwinAI/status/1709842276929728610

Unfortunately, ImageTwin has now fallen back into the same league as Acuna et al. Not unexpectedly, Science Magazine has chosen Proofig for image testing, despite the nice group shot of Starke and some other image sleuths.


CC-BY-NC

R Groundhog

Reproducible science needs controlled environments.

Every Python programmer knows the numerous incompatibilities behind “conda activate…”, while there is no such thing in R. Well, until now… or at least that is what I learned today…

Groundhog should be an R core function.

 

https://groundhogr.com/

Aug 5, 2023

Thank you for pointing me to renv 1.0.0

We’re thrilled to announce the release of renv 1.0.0. renv has been around since 2019 as the successor to packrat, but this is the first time (!!) we’re blogging about it.



Statistical parrot

Harald Lesch talks about AI language models as “statistical parrots”. Even more worrisome are the hallucinations

“Language models are trained to predict the next word,” said Yilun Du, a researcher at MIT who was previously a research fellow at OpenAI, and one of the paper’s authors. “They are not trained to tell people they don’t know what they’re doing.”



GPSless theft protection with a mobile router

After some trial and error, I am now invoking my own startup script on a Teltonika RUT950 router

opkg install nano
nano --saveonexit /etc/rc.local
source /etc/myscript.sh
CTRL+x

and here is the script

touch /etc/myscript.sh
chmod +x /etc/myscript.sh
nano --saveonexit /etc/myscript.sh

sending the following variables

cellid=$(gsmctl -C)
op=$(gsmctl -o)
lac=$(gsmctl -A 'AT+CGREG?' | cut -d'"' -f 2)
gsmctl --sms --send "0049********** $op $lac $cellid"
CTRL+x


Science Foo Camp

Where I would really like to go at least once in my life is the Science Foo Camp. Unfortunately, I have never been invited… Meetings with no fixed agenda are great, but maybe too anarchic for the DFG?

 



Call for an AI moratorium: Pause Giant AI Experiments

More than 1,000 technology leaders and researchers … have urged artificial intelligence labs to pause development of the most advanced systems, warning in an open letter that A.I. tools present “profound risks to society and humanity.”
A.I. developers are “locked in an out-of-control race to develop and deploy ever more powerful digital minds that no one — not even their creators — can understand, predict or reliably control,” according to the letter, which the nonprofit Future of Life Institute released on Wednesday.

I signed the letter as well (although some other people may have signed for other reasons).

 

May 5, 2023

30,000 signatures by today while the White House now also

pushed Silicon Valley chief executives to limit the risks of artificial intelligence, telling them they have a “moral” obligation to keep products safe, in the administration’s most visible effort yet to regulate AI.



Audio/video sync + latency test file

Here is a 60 fps audio/video test file with a 440 Hz beep every 3 s to test latency in OBS.
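For anyone who wants to build a similar audio track, here is a minimal sketch in Python using only the standard library. It is not how the file above was produced; the sample rate, beep length, and function names are my assumptions.

```python
import math
import struct
import wave

SAMPLE_RATE = 48000      # assumed; a common rate for video work
BEEP_HZ = 440            # beep frequency from the post
BEEP_MS = 100            # assumed beep length
PERIOD_SECONDS = 3       # one beep every 3 s, as in the post

BEEP_SAMPLES = SAMPLE_RATE * BEEP_MS // 1000
PERIOD_SAMPLES = SAMPLE_RATE * PERIOD_SECONDS


def beep_track(total_seconds):
    """Return 16-bit mono samples: a short 440 Hz beep at the start of every period."""
    samples = []
    for n in range(total_seconds * SAMPLE_RATE):
        in_beep = (n % PERIOD_SAMPLES) < BEEP_SAMPLES
        value = math.sin(2 * math.pi * BEEP_HZ * n / SAMPLE_RATE) if in_beep else 0.0
        samples.append(int(value * 0.5 * 32767))   # half of full scale
    return samples


def write_wav(path, samples):
    """Write the samples as a mono 16-bit PCM WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
```

Calling `write_wav("beep_440hz.wav", beep_track(12))` would write a 12 s track with four beeps; the video part with a matching visual flash still has to be added in a video editor.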

Direct download here.

 




Remote control of any Phase One from a tablet

While it is straightforward to connect a new Hasselblad X2D 100C to an iPad for viewing high-res images, it’s a pain with older Phase One backs like the P65+. These older backs, including the P45+, have an excellent image quality that even a decade after their introduction is on par with much newer IQ backs. Unfortunately, the old backs have only a small screen with a menu system from the 1980s that is basically unusable. The biggest issue, however, is the FireWire connection, which has been abandoned by Apple, cutting older Mamiya/Phase One cameras off from any LAN or WLAN access by Capture One. Nevertheless, I tried a FireWire dock as recommended by Phase One, only to find out that it does not work with any current macOS.

So why not go back and use a computer from the same era?

What finally worked was a late-2012 Mac mini from eBay (100€). This unit still has FireWire and works perfectly with Capture One 12 under Catalina. Both packages are in the archives (if not already pre-installed, as on my eBay purchase). I tried to control the Mac mini remotely with VNC from the iPad, but this was not an option, as the connection was not responsive and the screen did not scale correctly. There is, however, a better solution for the so-called “headless use” of the Mac mini without any display and keyboard.

The installation of course needs a display and keyboard at the start. Both devices need to be in the same WLAN on the same frequency band (in my case 2.4 GHz because of the mini). Any VPN and firewall should be switched off before installing the Luna Display app on both machines. We also need a small HDMI display port dongle from astropad.com (50€) that connects the devices.

iPad screenshot connected to Mac mini using luna display app. All icons can be reached.
iPad screenshot connected to Mac mini with Capture One open.

After that, we can remove keyboard and screen from the Mac mini, and it is super nice to now control the Phase One through the original Capture One on a touch display. It works both wirelessly and over a USB connection between Mac mini and iPad, and it should work with any Phase One camera. Best of all, even an iPhone with C1 Pilot will connect to Capture One on the Mac mini.

The only drawback: this current solution is not fully portable, as the Mac mini needs 220 V. But luckily, there are 12 V DC conversion kits in the high-end audio market (120€).

Or you just use your big flash power unit that can output 110/220V…



A real, no-fake Springer Nature press release

Springer Nature continues its focus on tailored solutions for academics with acquisition of researcher-created writing tool, TooWrite (14 Feb 23)

Developed by researchers for researchers, the TooWrite platform streamlines and simplifies scientific writing by guiding researchers through the process as if they were answering a questionnaire. In addition, expert how-to guides are attached to each question, supporting researchers as if they had an editor by their side. By structuring it in this step by step way, researchers’ time is freed up by making the writing process more efficient.

 

the comment that hits the nail on the head

At some point there will be nothing left to buy. But then there will be no way out for researchers anymore from ready-made workflows that suck them dry at every stage of the research life cycle.

https://openbiblio.social/@RenkeSiems/109867215004553634 16Feb23 

 

the strategy is from “cradle to grave”

https://twitter.com/brembs/status/1625871585428004865 17Feb23

 

in the original full version

https://101innovations.wordpress.com/workflows/ 17Feb23


How to stream from your phone

Wireless transmission worked perfectly for me with Elgato’s EpocCam from iPhone to MacBook, where I am using OBS. And well, the native USB connection now works out of the box since macOS 13 Ventura.

But how to attach the mobile cam under Linux? So far I had connected only DSLRs, but I recently found DroidCam, which supports not only Android and iPhone but also includes a Linux client. As a bonus, multiple remote actions are possible, from setting white balance to focus mode or flash settings.

 



How to regulate ChatGPT use

With the ever-increasing use of ChatGPT, there is also a debate not only on responsibility but also on crediting findings to individual authors.

The artificial-intelligence (AI) chatbot ChatGPT that has taken the world by storm has made its formal debut in the scientific literature — racking up at least four authorship credits on published papers and preprints.  Journal editors, researchers and publishers are now debating the place of such AI tools in the published literature, and whether it’s appropriate to cite the bot as an author.

Software recognition of AI-generated text is not 100% accurate, in particular if fewer than 1000 characters are available. And of course, scientific texts will always be edited to evade the classifier.

Having discussed this issue here yesterday, we think that we need some kind of software regulation: sending the generated AI output not only to the individual user but also keeping a full logfile of the output that can be accessed, indexed and searched by everybody.

 

 

 



DPI and PPI in scientific publications

It seems that graphic design tasks are also being shifted more and more to the authors. A journal asked me to revise my R “figures with >6 px size fonts at 600 dpi”, which does not make much sense to me, as it mixes apples and oranges (DPI and PPI) and does not even give me any size estimate of the final image.

Fortunately, there is a great website that has a lot of helpful explanations and demo code.

In summary, PPI is the number of pixels per inch. A pixel, the picture element, is the smallest unit that can be displayed on a screen. My screen has about 227 ppi, which is super sharp, while for historical reasons a logical PPI of 96 rather than the physical value is assumed for screens.
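A short worked example in Python shows why a font size in pixels says little without the final image size. This is only a sketch: the helper names and the 3.5 inch column width are my assumptions; the 72 points per inch is the usual typographic convention.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def px_to_points(pixels, ppi):
    """Physical height in points of a glyph that is `pixels` tall at `ppi`."""
    return pixels / ppi * POINTS_PER_INCH


def pixels_for_width(width_inches, ppi):
    """Pixel width needed for an image printed `width_inches` wide at `ppi`."""
    return round(width_inches * ppi)


# A 6 px font at 600 ppi is tiny on paper, at 96 ppi it is merely small
print(px_to_points(6, 600))
print(px_to_points(6, 96))

# Assumed single-column figure, 3.5 inch wide, at 600 ppi
print(pixels_for_width(3.5, 600))
```

The same 6 px glyph is 0.72 pt tall at 600 ppi but 4.5 pt at 96 ppi, so the journal's requirement only becomes meaningful once the printed width of the figure is fixed.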



PHP update issues: Goodbye miniflux, welcome freshRSS

After my provider forced an upgrade from PHP 7.4 to PHP 8.1, miniflux v1 no longer worked with Reeder 5. After trying to correct all issues with phan, I decided to switch to a freshRSS install that even has a more modern Google API.

BTW I find it quite helpful to debug PHP 7.4 code by intentionally throwing a JavaScript error to block any ongoing action with

throw "SimulatedException"

and setting PHP more verbose with

error_reporting(E_ALL | E_STRICT);

While it was quite straightforward to replace all “each” with “foreach”, or “read_exif_data” with “exif_read_data”, the main issues were unquoted “$_SERVER” variables and functions that need to be checked for missing variables

$rows = is_countable($ret) ? count($ret) : 0;

The biggest problems came with the Nextcloud update to version 25. It turns out that my internet provider imposes a limit of 262,144 files.

Any Nextcloud update needs about 30,000 files. Whenever it encounters an error, a restart produces another 30,000 files, basically killing my account.

Contrary to the documentation, the leftover files are also not in the root Nextcloud updater directory but in the data directory.

superfluous Nextcloud backup files hiding in the data directory
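To find out where the files pile up, a per-directory file count helps. Here is a minimal sketch in Python, assuming script access to the web space; the function name and the example path are hypothetical and not part of Nextcloud.

```python
import os
from collections import Counter


def count_files_by_top_dir(root):
    """Count files below `root`, grouped by first-level subdirectory.

    Files directly in `root` are counted under the key ".".
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        counts[top] += len(filenames)
    return counts
```

Something like `count_files_by_top_dir("/path/to/nextcloud/data").most_common(5)` would list the five fullest subdirectories, which can then be compared against the provider's file limit.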
