Advocating Character Error Rate for Multilingual ASR Evaluation

Authors: Thennal D K, Jesin James, Deepa P Gopinath, Muhammed Ashraf K

Abstract: Automatic speech recognition (ASR) systems have traditionally been evaluated using English datasets, with the word error rate (WER) serving as the predominant metric. WER's simplicity and ease of interpretation have contributed to its widespread adoption, particularly for English. However, as ASR systems expand to multilingual contexts, WER fails in various ways, particularly with morphologically complex languages or those without clear word boundaries. Our work documents the limitations of WER as an evaluation metric and advocates for the character error rate (CER) as the primary metric in multilingual ASR evaluation. We show that CER avoids many of the challenges WER faces and exhibits greater consistency across writing systems. We support our proposition by conducting human evaluations of ASR transcriptions in three languages: Malayalam, English, and Arabic, which exhibit distinct morphological characteristics. We show that CER correlates more closely with human judgments than WER, even for English. To facilitate further research, we release our human evaluation dataset for future benchmarking of ASR metrics. Our findings suggest that CER should be prioritized, or at least supplemented, in multilingual ASR evaluations to account for the varying linguistic characteristics of different languages.

Submitted 18 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

Comments: 4 pages
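
For readers unfamiliar with the two metrics compared in the abstract above, here is a minimal sketch of how WER and CER are commonly computed. It assumes the usual Levenshtein-distance definition, whitespace tokenization for words, and no text normalization. The function names are illustrative and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words.

    Assumption: words are separated by whitespace, which is exactly the
    step that is ill-defined for scripts without clear word boundaries.
    """
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref, hyp = "the cat sat", "the cat sit"
    print(f"WER = {wer(ref, hyp):.3f}")  # one wrong word out of three
    print(f"CER = {cer(ref, hyp):.3f}")  # one wrong character out of eleven
```

In this toy pair a single wrong character costs a whole word under WER (1/3) but only one character under CER (1/11), which hints at why the two metrics can diverge for languages with long, morphologically complex words.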

arXiv:2410.04981 [pdf, other]

On the Rigour of Scientific Writing: Criteria, Analysis, and Insights

Authors: Joseph James, Chenghao Xiao, Yucheng Li, Chenghua Lin

Abstract: Rigour is crucial for scientific research as it ensures the reproducibility and validity of results and findings. Despite its importance, little work exists on modelling rigour computationally, and there is a lack of analysis on whether these criteria can effectively signal or measure the rigour of scientific papers in practice. In this paper, we introduce a bottom-up, data-driven framework to automatically identify and define rigour criteria and assess their relevance in scientific writing. Our framework includes rigour keyword extraction, detailed rigour definition generation, and salient criteria identification. Furthermore, our framework is domain-agnostic and can be tailored to the evaluation of scientific rigour for different areas, accommodating the distinct salient criteria across fields. We conducted comprehensive experiments based on datasets collected from two high impact venues for Machine Learning and NLP (i.e., ICLR and ACL) to demonstrate the effectiveness of our framework in modelling rigour. In addition, we analyse linguistic patterns of rigour, revealing that framing certainty is crucial for enhancing the perception of scientific rigour, while suggestion certainty and probability uncertainty diminish it.

Submitted 7 October, 2024; originally announced October 2024.

Comments: Accepted Findings at EMNLP 2024

arXiv:2407.17416 [pdf, other]

Explaining Spectrograms in Machine Learning: A Study on Neural Networks for Speech Classification

Authors: Jesin James, Balamurali B. T., Binu Abeysinghe, Junchen Liu

Abstract: This study investigates discriminative patterns learned by neural networks for accurate speech classification, with a specific focus on vowel classification tasks. By examining the activations and features of neural networks for vowel classification, we gain insights into what the networks "see" in spectrograms. Through the use of class activation mapping, we identify the frequencies that contribute to vowel classification and compare these findings with linguistic knowledge. Experiments on an American English dataset of vowels showcase the explainability of neural networks and provide valuable insights into the causes of misclassifications and their characteristics when differentiating them from unvoiced speech. This study not only enhances our understanding of the underlying acoustic cues in vowel classification but also offers opportunities for improving speech recognition by bridging the gap between abstract representations in neural networks and established linguistic knowledge.

Submitted 10 July, 2024; originally announced July 2024.

Comments: 5th International Conference on Artificial Intelligence and Speech Technology (AIST-2023), New Delhi, India

arXiv:2405.16774 [pdf, ps, other]

Probabilistic Height Grid Terrain Mapping for Mining Shovels using LiDAR

Authors: Vedant Bhandari, Jasmin James, Tyson Phillips, P. Ross McAree

Abstract: This paper explores the question of creating and maintaining terrain maps in environments where the terrain changes. The specific example explored is the construction of terrain maps from 3D LiDAR measurements on an electric rope shovel. The approach extends the height grid representation of terrain to include a Hidden Markov Model in each cell, enabling confidence-based mapping of constantly changing terrain. There are inherent difficulties in this problem, including semantic labelling of the LiDAR measurements associated with machinery and determining the pose of the sensor. Solutions to both of these problems are explored. The significance of this work lies in the need for accurate terrain mapping to support autonomous machine operation.

Submitted 21 June, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

Comments: 6 pages, 5 figures

arXiv:2405.02556 [pdf, other]

Few-Shot Fruit Segmentation via Transfer Learning

Authors: Jordan A. James, Heather K. Manching, Amanda M. Hulse-Kemp, William J. Beksi

Abstract: Advancements in machine learning, computer vision, and robotics have paved the way for transformative solutions in various domains, particularly in agriculture. For example, accurate identification and segmentation of fruits from field images plays a crucial role in automating jobs such as harvesting, disease detection, and yield estimation. However, achieving robust and precise infield fruit segmentation remains a challenging task since large amounts of labeled data are required to handle variations in fruit size, shape, color, and occlusion. In this paper, we develop a few-shot semantic segmentation framework for infield fruits using transfer learning. Concretely, our work is aimed at addressing agricultural domains that lack publicly available labeled data. Motivated by similar success in urban scene parsing, we propose specialized pre-training using a public benchmark dataset for fruit transfer learning. By leveraging pre-trained neural networks, accurate semantic segmentation of fruit in the field is achieved with only a few labeled images. Furthermore, we show that models with pre-training learn to distinguish between fruit still on the trees and fruit that have fallen on the ground, and they can effectively transfer the knowledge to the target fruit dataset.

Submitted 4 May, 2024; originally announced May 2024.

Comments: To be published in the 2024 IEEE International Conference on Robotics and Automation (ICRA)

arXiv:2404.17038 [pdf, other]

Evaluating Collaborative Autonomy in Opposed Environments using Maritime Capture-the-Flag Competitions

Authors: Jordan Beason, Michael Novitzky, John Kliem, Tyler Errico, Zachary Serlin, Kevin Becker, Tyler Paine, Michael Benjamin, Prithviraj Dasgupta, Peter Crowley, Charles O'Donnell, John James

Abstract: The objective of this work is to evaluate multi-agent artificial intelligence methods when deployed on teams of unmanned surface vehicles (USV) in an adversarial environment. Autonomous agents were evaluated in real-world scenarios using the Aquaticus test-bed, which is a Capture-the-Flag (CTF) style competition involving teams of USV systems. Cooperative teaming algorithms of various foundations in behavior-based optimization and deep reinforcement learning (RL) were deployed on these USV systems in two versus two teams and tested against each other during a competition period in the fall of 2023. Deep reinforcement learning applied to USV agents was achieved via the Pyquaticus test bed, a lightweight gymnasium environment that allows simulated CTF training in a low-level environment. The results of the experiment demonstrate that rule-based cooperation for behavior-based agents outperformed those trained in deep reinforcement learning paradigms as implemented in these competitions. Further integration of the Pyquaticus gymnasium environment for RL with MOOS-IvP in terms of configuration and control schema will allow for more competitive CTF games in future studies. As the development of experimental deep RL methods continues, the authors expect that the competitive gap between behavior-based autonomy and deep RL will be reduced. As such, this report outlines the overall competition, methods, and results with an emphasis on future works such as reward shaping and sim-to-real methodologies and extending rule-based cooperation among agents to react to safety and security events in accordance with human experts' intent/rules for executing safety and security processes.

Submitted 25 April, 2024; originally announced April 2024.

Comments: Accepted to the IEEE ICRA Workshop on Field Robotics 2024

arXiv:2311.05452 [pdf, other]

Transformer-based Model for Oral Epithelial Dysplasia Segmentation

Authors: Adam J Shephard, Hanya Mahmood, Shan E Ahmed Raza, Anna Luiza Damaceno Araujo, Alan Roger Santos-Silva, Marcio Ajudarte Lopes, Pablo Agustin Vargas, Kris McCombe, Stephanie Craig, Jacqueline James, Jill Brooks, Paul Nankivell, Hisham Mehanna, Syed Ali Khurram, Nasir M Rajpoot

Abstract: Oral epithelial dysplasia (OED) is a premalignant histopathological diagnosis given to lesions of the oral cavity. OED grading is subject to large inter/intra-rater variability, resulting in the under/over-treatment of patients. We developed a new Transformer-based pipeline to improve detection and segmentation of OED in haematoxylin and eosin (H&E) stained whole slide images (WSIs). Our model was trained on OED cases (n = 260) and controls (n = 105) collected using three different scanners, and validated on test data from three external centres in the United Kingdom and Brazil (n = 78). Our internal experiments yield a mean F1-score of 0.81 for OED segmentation, which reduced slightly to 0.71 on external testing, showing good generalisability, and gaining state-of-the-art results. This is the first externally validated study to use Transformers for segmentation in precancerous histology images. Our publicly available model shows great promise to be the first step of a fully-integrated pipeline, allowing earlier and more efficient OED diagnosis, ultimately benefiting patient outcomes.

Submitted 9 November, 2023; originally announced November 2023.

Comments: 5 pages, 2 figures, 4 tables

arXiv:2309.15217 [pdf, other]

RAGAS: Automated Evaluation of Retrieval Augmented Generation

Authors: Shahul Es, Jithin James, Luis Espinosa-Anke, Steven Schockaert

Abstract: We introduce RAGAs (Retrieval Augmented Generation Assessment), a framework for reference-free evaluation of Retrieval Augmented Generation (RAG) pipelines. RAG systems are composed of a retrieval and an LLM-based generation module, and provide LLMs with knowledge from a reference textual database, which enables them to act as a natural language layer between a user and textual databases, reducing the risk of hallucinations. Evaluating RAG architectures is, however, challenging because there are several dimensions to consider: the ability of the retrieval system to identify relevant and focused context passages, the ability of the LLM to exploit such passages in a faithful way, or the quality of the generation itself. With RAGAs, we put forward a suite of metrics which can be used to evaluate these different dimensions without having to rely on ground truth human annotations. We posit that such a framework can crucially contribute to faster evaluation cycles of RAG architectures, which is especially important given the fast adoption of LLMs.

Submitted 26 September, 2023; originally announced September 2023.

Comments: Reference-free (not tied to having ground truth available) evaluation framework for retrieval augmented generation

arXiv:2309.05645 [pdf, other]

CitDet: A Benchmark Dataset for Citrus Fruit Detection

Authors: Jordan A. James, Heather K. Manching, Matthew R. Mattia, Kim D. Bowman, Amanda M. Hulse-Kemp, William J. Beksi

Abstract: In this letter, we present a new dataset to advance the state of the art in detecting citrus fruit and accurately estimate yield on trees affected by the Huanglongbing (HLB) disease in orchard environments via imaging. Despite the fact that significant progress has been made in solving the fruit detection problem, the lack of publicly available datasets has complicated direct comparison of results. For instance, citrus detection has long been of interest to the agricultural research community, yet there is an absence of work, particularly involving public datasets of citrus affected by HLB. To address this issue, we enhance state-of-the-art object detection methods for use in typical orchard settings. Concretely, we provide high-resolution images of citrus trees located in an area known to be highly affected by HLB, along with high-quality bounding box annotations of citrus fruit. Fruit on both the trees and the ground are labeled to allow for identification of fruit location, which contributes to advancements in yield estimation and potential measure of HLB impact via fruit drop. The dataset consists of over 32,000 bounding box annotations for fruit instances contained in 579 high-resolution images. In summary, our contributions are the following: (i) we introduce a novel dataset along with baseline performance benchmarks on multiple contemporary object detection algorithms, (ii) we show the ability to accurately capture fruit location on tree or on ground, and finally (iii) we present a correlation of our results with yield estimations.

Submitted 9 October, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

Comments: To be published in IEEE Robotics and Automation Letters (RA-L)

arXiv:2307.03757 [pdf]

A Fully Automated and Explainable Algorithm for the Prediction of Malignant Transformation in Oral Epithelial Dysplasia

Authors: Adam J Shephard, Raja Muhammad Saad Bashir, Hanya Mahmood, Mostafa Jahanifar, Fayyaz Minhas, Shan E Ahmed Raza, Kris D McCombe, Stephanie G Craig, Jacqueline James, Jill Brooks, Paul Nankivell, Hisham Mehanna, Syed Ali Khurram, Nasir M Rajpoot

Abstract: Oral epithelial dysplasia (OED) is a premalignant histopathological diagnosis given to lesions of the oral cavity. Its grading suffers from significant inter-/intra-observer variability, and does not reliably predict malignancy progression, potentially leading to suboptimal treatment decisions. To address this, we developed a novel artificial intelligence algorithm that can assign an Oral Malignant Transformation (OMT) risk score, based on histological patterns in Haematoxylin and Eosin stained whole slide images, to quantify the risk of OED progression. The algorithm is based on the detection and segmentation of nuclei within (and around) the epithelium using an in-house segmentation model. We then employed a shallow neural network fed with interpretable morphological/spatial features, emulating histological markers. We conducted internal cross-validation on our development cohort (Sheffield; n = 193 cases) followed by independent validation on two external cohorts (Birmingham and Belfast; n = 92 cases). The proposed OMTscore yields an AUROC = 0.74 in predicting whether an OED progresses to malignancy or not. Survival analyses showed the prognostic value of our OMTscore for predicting malignancy transformation, when compared to the manually-assigned WHO and binary grades. Analysis of the correctly predicted cases elucidated the presence of peri-epithelial and epithelium-infiltrating lymphocytes in the most predictive patches of cases that transformed (p < 0.0001). This is the first study to propose a completely automated algorithm for predicting OED transformation based on interpretable nuclear features, whilst being validated on external datasets. The algorithm shows better-than-human-level performance for prediction of OED malignant transformation and offers a promising solution to the challenges of grading OED in routine clinical practice.

Submitted 6 July, 2023; originally announced July 2023.

arXiv:2301.09617 [pdf, other]

Fully transformer-based biomarker prediction from colorectal cancer histology: a large-scale multicentric study

Authors: Sophia J. Wagner, Daniel Reisenbüchler, Nicholas P. West, Jan Moritz Niehues, Gregory Patrick Veldhuizen, Philip Quirke, Heike I. Grabsch, Piet A. van den Brandt, Gordon G. A. Hutchins, Susan D. Richman, Tanwei Yuan, Rupert Langer, Josien Christina Anna Jenniskens, Kelly Offermans, Wolfram Mueller, Richard Gray, Stephen B. Gruber, Joel K. Greenson, Gad Rennert, Joseph D. Bonner, Daniel Schmolze, Jacqueline A. James, Maurice B. Loughrey, Manuel Salto-Tellez, Hermann Brenner, et al. (6 additional authors not shown)

Abstract: Background: Deep learning (DL) can extract predictive and prognostic biomarkers from routine pathology slides in colorectal cancer. For example, a DL test for the diagnosis of microsatellite instability (MSI) in CRC was approved in 2022. Current approaches rely on convolutional neural networks (CNNs). Transformer networks are outperforming CNNs and are replacing them in many applications, but have not been used for biomarker prediction in cancer at a large scale. In addition, most DL approaches have been trained on small patient cohorts, which limits their clinical utility. Methods: In this study, we developed a new fully transformer-based pipeline for end-to-end biomarker prediction from pathology slides. We combine a pre-trained transformer encoder and a transformer network for patch aggregation, capable of yielding single and multi-target prediction at patient level. We train our pipeline on over 9,000 patients from 10 colorectal cancer cohorts. Results: A fully transformer-based approach massively improves the performance, generalizability, data efficiency, and interpretability as compared with current state-of-the-art algorithms. After training on a large multicenter cohort, we achieve a sensitivity of 0.97 with a negative predictive value of 0.99 for MSI prediction on surgical resection specimens. We demonstrate for the first time that resection specimen-only training reaches clinical-grade performance on endoscopic biopsy tissue, solving a long-standing diagnostic problem. Interpretation: A fully transformer-based end-to-end pipeline trained on thousands of pathology slides yields clinical-grade performance for biomarker prediction on surgical resections and biopsies. Our new methods are freely available under an open source license.

Submitted 1 March, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

Comments: Updated Figure 2 and Table A.5

arXiv:2301.00049 [pdf]

Multi-Finger Haptics: Analysis of Human Hand Grasp towards a Tripod Three-Finger Haptic Grasp model

Authors: Jose James

Abstract: Grasping is an incredible ability of animals using their arms and limbs in their daily life. The human hand is an especially astonishing multi-fingered tool for precise grasping, which helped humans to develop the modern world. The implementation of the human grasp to virtual reality and telerobotics is always interesting and challenging at the same time. In this work, the authors surveyed, studied, and analyzed the human hand-grasping behavior for the possibilities of haptic grasping in the virtual and remote environment. This work is focused on the motion and force analysis of fingers in human hand grasping scenarios and the paper describes the transition of the human hand grasping towards a tripod haptic grasp model for effective interaction in virtual reality.

Submitted 30 December, 2022; originally announced January 2023.

Journal ref: Sensor 2022

arXiv:2210.13769 [pdf, other]

GlobalFlowNet: Video Stabilization using Deep Distilled Global Motion Estimates

Authors: Jerin Geo James, Devansh Jain, Ajit Rajwade

Abstract: Videos shot by laymen using hand-held cameras contain undesirable shaky motion. Estimating the global motion between successive frames, in a manner not influenced by moving objects, is central to many video stabilization techniques, but poses significant challenges. A large body of work uses 2D affine transformations or homography for the global motion. However, in this work, we introduce a more general representation scheme, which adapts any existing optical flow network to ignore the moving objects and obtain a spatially smooth approximation of the global motion between video frames. We achieve this by a knowledge distillation approach, where we first introduce a low pass filter module into the optical flow network to constrain the predicted optical flow to be spatially smooth. This becomes our student network, named GlobalFlowNet. Then, using the original optical flow network as the teacher network, we train the student network using a robust loss function. Given a trained GlobalFlowNet, we stabilize videos using a two stage process. In the first stage, we correct the instability in affine parameters using a quadratic programming approach constrained by a user-specified cropping limit to control loss of field of view. In the second stage, we stabilize the video further by smoothing global motion parameters, expressed using a small number of discrete cosine transform coefficients. In extensive experiments on a variety of different videos, our technique outperforms state of the art techniques in terms of subjective quality and different quantitative measures of video stability. The source code is publicly available at https://github.com/GlobalFlowNet/GlobalFlowNet.

Submitted 4 November, 2022; v1 submitted 25 October, 2022; originally announced October 2022.

Comments: Accepted in WACV 2023

arXiv:2208.11278 [pdf, other]

Federated Self-Supervised Contrastive Learning and Masked Autoencoder for Dermatological Disease Diagnosis

Authors: Yawen Wu, Dewen Zeng, Zhepeng Wang, Yi Sheng, Lei Yang, Alaina J. James, Yiyu Shi, Jingtong Hu

Abstract: In dermatological disease diagnosis, the private data collected by mobile dermatology assistants exist on distributed mobile devices of patients. Federated learning (FL) can use decentralized data to train models while keeping data local. Existing FL methods assume all the data have labels. However, medical data often comes without full labels due to high labeling costs. Self-supervised learning (SSL) methods, contrastive learning (CL) and masked autoencoders (MAE), can leverage the unlabeled data to pre-train models, followed by fine-tuning with limited labels. However, combining SSL and FL has unique challenges. For example, CL requires diverse data but each device only has limited data. For MAE, while Vision Transformer (ViT) based MAE has higher accuracy than CNNs in centralized learning, MAE's performance in FL with unlabeled data has not been investigated. Besides, the ViT synchronization between the server and clients is different from traditional CNNs. Therefore, special synchronization methods need to be designed. In this work, we propose two federated self-supervised learning frameworks for dermatological disease diagnosis with limited labels. The first one features lower computation costs, suitable for mobile devices. The second one features high accuracy and fits high-performance servers. Based on CL, we proposed federated contrastive learning with feature sharing (FedCLF). Features are shared for diverse contrastive information without sharing raw data for privacy. Based on MAE, we proposed FedMAE. Knowledge split separates the global and local knowledge learned from each client. Only global knowledge is aggregated for higher generalization performance. Experiments on dermatological disease datasets show superior accuracy of the proposed frameworks over state-of-the-art methods.

Submitted 23 August, 2022; originally announced August 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2202.07470

arXiv:2208.09778 [pdf, other]

The Development of a Labelled te reo Māori-English Bilingual Database for Language Technology

Authors: Jesin James, Isabella Shields, Vithya Yogarajan, Peter J. Keegan, Catherine Watson, Peter-Lucas Jones, Keoni Mahelona

Abstract: Te reo Māori (referred to as Māori), New Zealand's indigenous language, is under-resourced in language technology. Māori speakers are bilingual, where Māori is code-switched with English. Unfortunately, there are minimal resources available for Māori language technology, language detection and code-switch detection for the Māori-English pair. Both English and Māori use Roman-derived orthography, making rule-based systems for detecting language and code-switching restrictive. Most Māori language detection is done manually by language experts. This research builds a Māori-English bilingual database of 66,016,807 words with word-level language annotation. The New Zealand Parliament Hansard debate reports were used to build the database. The language labels are assigned using language-specific rules and expert manual annotations. Words with the same spelling, but different meanings, exist for Māori and English. These words could not be categorised as Māori or English based on word-level language rules. Hence, manual annotations were necessary. An analysis of various aspects of the database, such as metadata, year-wise analysis, frequently occurring words, sentence length and N-grams, is also reported. The database developed here is a valuable tool for future language and speech technology development for Aotearoa New Zealand. The methodology followed to label the database can also be followed by other low-resourced language pairs.

Submitted 20 August, 2022; originally announced August 2022.

Comments: Submitted to Springer Language Resources and Evaluation Journal 2022

arXiv:2208.09775 [pdf, other]

Visualising Model Training via Vowel Space for Text-To-Speech Systems

Authors: Binu Abeysinghe, Jesin James, Catherine I. Watson, Felix Marattukalam

Abstract: With the recent developments in speech synthesis via machine learning, this study explores incorporating linguistics knowledge to visualise and evaluate synthetic speech model training. If changes to the first and second formant (in turn, the vowel space) can be seen and heard in synthetic speech, this knowledge can inform speech synthesis technology developers. A speech synthesis model trained on a large General American English database was fine-tuned into a New Zealand English voice to identify if the changes in the vowel space of synthetic speech could be seen and heard. The vowel spaces at different intervals during the fine-tuning were analysed to determine if the model learned the New Zealand English vowel space. Our findings based on vowel space analysis show that we can visualise how a speech synthesis model learns the vowel space of the database it is trained on. Perception tests confirmed that humans could perceive when a speech synthesis model has learned characteristics of the speech database it is training on. Using the vowel space as an intermediary evaluation helps understand what sounds are to be added to the training database and build speech synthesis models based on linguistics knowledge.

Submitted 20 August, 2022; originally announced August 2022.

Comments: Accepted to Interspeech 2022

arXiv:2206.11520 [pdf, other]

ICOS Protein Expression Segmentation: Can Transformer Networks Give Better Results?

Authors: Vivek Kumar Singh, Paul O Reilly, Jacqueline James, Manuel Salto Tellez, Perry Maxwell

Abstract: Biomarkers identify a patient's response to treatment. Despite the recent advances in artificial intelligence based on Transformer networks, only limited research has been done to measure their performance on challenging histopathology images. In this paper, we investigate the efficacy of numerous state-of-the-art Transformer networks for immune-checkpoint biomarker, Inducible T-cell COStimulator (ICOS), protein cell segmentation in colon cancer from immunohistochemistry (IHC) slides. Extensive and comprehensive experimental results confirm that MiSSFormer achieved the highest Dice score, 74.85%, among the evaluated Transformer and Efficient U-Net methods.

Submitted 23 June, 2022; originally announced June 2022.

Comments: Accepted MIUA conference (Abstract short paper)

arXiv:2205.01167 [pdf]

3D Convolutional Neural Networks for Dendrite Segmentation Using Fine-Tuning and Hyperparameter Optimization

Authors: Jim James, Nathan Pruyne, Tiberiu Stan, Marcus Schwarting, Jiwon Yeom, Seungbum Hong, Peter Voorhees, Ben Blaiszik, Ian Foster

Abstract: Dendritic microstructures are ubiquitous in nature and are the primary solidification morphologies in metallic materials. Techniques such as x-ray computed tomography (XCT) have provided new insights into dendritic phase transformation phenomena. However, manual identification of dendritic morphologies in microscopy data can be both labor intensive and potentially ambiguous. The analysis of 3D datasets is particularly challenging due to their large sizes (terabytes) and the presence of artifacts scattered within the imaged volumes. In this study, we trained 3D convolutional neural networks (CNNs) to segment 3D datasets. Three CNN architectures were investigated, including a new 3D version of FCDense. We show that using hyperparameter optimization (HPO) and fine-tuning techniques, both 2D and 3D CNN architectures can be trained to outperform the previous state of the art. The 3D U-Net architecture trained in this study produced the best segmentations according to quantitative metrics (pixel-wise accuracy of 99.84% and a boundary displacement error of 0.58 pixels), while 3D FCDense produced the smoothest boundaries and best segmentations according to visual inspection. The trained 3D CNNs are able to segment entire 852 x 852 x 250 voxel 3D volumes in only ~60 seconds, thus hastening the progress towards a deeper understanding of phase transformation phenomena such as dendritic solidification.

Submitted 2 May, 2022; originally announced May 2022.

arXiv:2202.07470 [pdf, other]

Federated Contrastive Learning for Dermatological Disease Diagnosis via On-device Learning

Authors: Yawen Wu, Dewen Zeng, Zhepeng Wang, Yi Sheng, Lei Yang, Alaina J. James, Yiyu Shi, Jingtong Hu

Abstract: Deep learning models have been deployed in an increasing number of edge and mobile devices to provide healthcare. These models rely on training with a tremendous amount of labeled data to achieve high accuracy. However, for medical applications such as dermatological disease diagnosis, the private data collected by mobile dermatology assistants exist on distributed mobile devices of patients, and each device only has a limited amount of data. Directly learning from limited data greatly deteriorates the performance of learned models. Federated learning (FL) can train models by using data distributed on devices while keeping the data local for privacy. Existing works on FL assume all the data have ground-truth labels. However, medical data often comes without any accompanying labels since labeling requires expertise and results in prohibitively high labor costs. The recently developed self-supervised learning approach, contrastive learning (CL), can leverage the unlabeled data to pre-train a model, after which the model is fine-tuned on limited labeled data for dermatological disease diagnosis. However, simply combining CL with FL as federated contrastive learning (FCL) will result in ineffective learning since CL requires diverse data for learning but each device only has limited data. In this work, we propose an on-device FCL framework for dermatological disease diagnosis with limited labels. Features are shared in the FCL pre-training process to provide diverse and accurate contrastive information. After that, the pre-trained model is fine-tuned with local labeled data independently on each device or collaboratively with supervised federated learning on all devices. Experiments on dermatological disease datasets show that the proposed framework effectively improves the recall and precision of dermatological disease diagnosis compared with state-of-the-art methods.

Submitted 13 February, 2022; originally announced February 2022.

arXiv:2106.16158 [pdf, other]

doi 10.1109/BRAINS52497.2021.9569833

Jack The Rippler: Arbitrage on the Decentralized Exchange of the XRP Ledger

Authors: Gaspard Peduzzi, Jason James, Jiahua Xu

Abstract: The XRP Ledger (XRPL) is a peer-to-peer cryptographic ledger. It features a decentralized exchange (DEX) where network participants can issue and trade user-defined digital assets and currencies. We present Jack the Rippler, a bot that identifies and exploits arbitrage opportunities on the XRPL DEX. We describe Jack's arbitrage process and discuss risks associated with using arbitrage bots.

Submitted 9 November, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

Journal ref: 3rd Conference on Blockchain Research & Applications for Innovative Networks and Services (BRAINS) (2021)

arXiv:2012.05928 [pdf, other]

doi 10.1093/mnras/stab164

A machine learning approach to galaxy properties: joint redshift-stellar mass probability distributions with Random Forest

Authors: S. Mucesh, W. G. Hartley, A. Palmese, O. Lahav, L. Whiteway, A. F. L. Bluck, A. Alarcon, A. Amon, K. Bechtol, G. M. Bernstein, A. Carnero Rosell, M. Carrasco Kind, A. Choi, K. Eckert, S. Everett, D. Gruen, R. A. Gruendl, I. Harrison, E. M. Huff, N. Kuropatkin, I. Sevilla-Noarbe, E. Sheldon, B. Yanny, M. Aguena, S. Allam , et al. (50 additional authors not shown)

Abstract: We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning (ML) algorithm, even with few photometric bands available. As an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses. We build two ML models: one containing deep photometry in the $griz$ bands, and the second reflecting the photometric scatter present in the main DES survey, with carefully constructed representative training data in each case. We validate our joint PDFs for $10,699$ test galaxies by utilizing the copula probability integral transform and the Kendall distribution function, and their univariate counterparts to validate the marginals. Benchmarked against a basic set-up of the template-fitting code BAGPIPES, our ML-based method outperforms template fitting on all of our predefined performance metrics. In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under $6$ min with consumer computer hardware. Such speed enables PDFs to be derived in real time within analysis codes, solving potential storage issues. As part of this work we have developed GALPRO, a highly intuitive and efficient Python package to rapidly generate multivariate PDFs on-the-fly. GALPRO is documented and available for researchers to use in their cosmology and galaxy evolution studies.

Submitted 19 February, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

Comments: 18 pages, 8 figures, Accepted by MNRAS

Report number: FERMILAB-PUB-20-653-AE, DES-2020-0542

Journal ref: Monthly Notices of the Royal Astronomical Society, Volume 502, Issue 2, April 2021, Pages 2770-2786

arXiv:2010.01928 [pdf, other]

Slip detection for grasp stabilisation with a multi-fingered tactile robot hand

Authors: Jasper W. James, Nathan F. Lepora

Abstract: Tactile sensing is used by humans when grasping to prevent us dropping objects. One key facet of tactile sensing is slip detection, which allows a gripper to know when a grasp is failing and take action to prevent an object being dropped. This study demonstrates the slip detection capabilities of the recently developed Tactile Model O (T-MO) by using support vector machines to detect slip and test multiple slip scenarios including responding to the onset of slip in real time with eleven different objects in various grasps. We demonstrate the benefits of slip detection in grasping by testing two real-world scenarios: adding weight to destabilise a grasp and using slip detection to lift up objects at the first attempt. The T-MO is able to detect when an object is slipping, react to stabilise the grasp and be deployed in real-world scenarios. This shows the T-MO is a suitable platform for autonomous grasping by using reliable slip detection to ensure a stable grasp in unstructured environments. Supplementary video: https://youtu.be/wOwFHaiHuKY

Submitted 5 October, 2020; originally announced October 2020.

Comments: Accepted into IEEE Transactions on Robotics

arXiv:2009.12856 [pdf, other]

doi 10.1088/1538-3873/abcaea

Machine Learning for Searching the Dark Energy Survey for Trans-Neptunian Objects

Authors: B. Henghes, O. Lahav, D. W. Gerdes, E. Lin, R. Morgan, T. M. C. Abbott, M. Aguena, S. Allam, J. Annis, S. Avila, E. Bertin, D. Brooks, D. L. Burke, A. CarneroRosell, M. CarrascoKind, J. Carretero, C. Conselice, M. Costanzi, L. N. da Costa, J. DeVicente, S. Desai, H. T. Diehl, P. Doel, S. Everett, I. Ferrero , et al. (34 additional authors not shown)

Abstract: In this paper we investigate how implementing machine learning could improve the efficiency of the search for Trans-Neptunian Objects (TNOs) within Dark Energy Survey (DES) data when used alongside orbit fitting. The discovery of multiple TNOs that appear to show a similarity in their orbital parameters has led to the suggestion that one or more undetected planets, an as yet undiscovered "Planet 9", may be present in the outer Solar System. DES is well placed to detect such a planet and has already been used to discover many other TNOs. Here, we perform tests on eight different supervised machine learning algorithms, using a dataset consisting of simulated TNOs buried within real DES noise data. We found that the best performing classifier was the Random Forest which, when optimised, performed well at detecting the rare objects. We achieve an area under the receiver operating characteristic (ROC) curve, (AUC) $= 0.996 \pm 0.001$. After optimizing the decision threshold of the Random Forest, we achieve a recall of 0.96 while maintaining a precision of 0.80. Finally, by using the optimized classifier to pre-select objects, we are able to run the orbit-fitting stage of our detection pipeline five times faster.

Submitted 10 December, 2020; v1 submitted 27 September, 2020; originally announced September 2020.

Comments: Published in PASP, 16 pages, 6 figures

Journal ref: PASP 133 014501 (2021)

arXiv:2009.00150 [pdf, ps, other]

Exactly Optimal Bayesian Quickest Change Detection for Hidden Markov Models

Authors: Jason J. Ford, Jasmin James, Timothy L. Molloy

Abstract: This paper considers the quickest detection problem for hidden Markov models (HMMs) in a Bayesian setting. We construct an augmented HMM representation of the problem that allows the application of a dynamic programming approach to prove that Shiryaev's rule is an (exact) optimal solution. This augmented representation highlights the problem's fundamental information structure and suggests possible relaxations to more exotic change event priors not appearing in the literature. Finally, this augmented representation also allows us to present an efficient computational method for implementing the optimal solution.

Submitted 15 March, 2023; v1 submitted 31 August, 2020; originally announced September 2020.

arXiv:2008.06904 [pdf, other]

A Biomimetic Tactile Fingerprint Induces Incipient Slip

Authors: Jasper W. James, Stephen J. Redmond, Nathan F. Lepora

Abstract: We present a modified TacTip biomimetic optical tactile sensor design which demonstrates the ability to induce and detect incipient slip, as confirmed by recording the movement of markers on the sensor's external surface. Incipient slip is defined as slippage of part, but not all, of the contact surface between the sensor and object. The addition of ridges - which mimic the friction ridges in the human fingertip - in a concentric ring pattern allowed for localised shear deformation to occur on the sensor surface for a significant duration prior to the onset of gross slip. By detecting incipient slip we were able to predict when several differently shaped objects were at risk of falling and prevent them from doing so. Detecting incipient slip is useful because a corrective action can be taken before slippage occurs across the entire contact area, thus minimising the risk of objects being dropped.

Submitted 16 August, 2020; originally announced August 2020.

Comments: Accepted into IROS 2020

arXiv:2001.04574 [pdf]

Preliminary Study of a Google Home Mini

Authors: Min Jin Park, Joshua I. James

Abstract: Many artificial intelligence (AI) speakers have recently come to market. Beginning with Amazon Echo, many companies are producing their own speaker technologies. Due to the limitations of technology, most speakers have similar functions, but the way of handling the data of each speaker is different. In the case of Amazon Echo, the API of the cloud is open for any developers to develop their API. The Amazon Echo has been around for a while, and much research has been done on it. However, not much research has been done on Google Home Mini analysis for digital investigations. In this paper, we will conduct some initial research on the data storing and security methods of Google Home Mini.

Submitted 13 January, 2020; originally announced January 2020.

Comments: 12 pages, 6 figures, 3 tables

Journal ref: Journal of Digital Forensics 13-3: 163-174 (2019). https://kdfs.jams.or.kr/jams/download/KCI_FI002513079.pdf

arXiv:2001.02320 [pdf, other]

RoboFly: An insect-sized robot with simplified fabrication that is capable of flight, ground, and water surface locomotion

Authors: Yogesh M Chukewad, Johannes James, Avinash Singh, Sawyer Fuller

Abstract: Aerial robots the size of a honeybee (~100 mg) have advantages over larger robots because of their small size, low mass and low materials cost. Previous iterations have demonstrated controlled flight but were difficult to fabricate because they consisted of many separate parts assembled together. They also were unable to perform locomotion modes besides flight. This paper presents a new design of a 74 mg flapping-wing robot that dramatically reduces the number of parts and simplifies fabrication. It also has a lower center of mass, which allows the robot to additionally land without the need for long legs, even in case of unstable flight. Furthermore, we show that the new design allows for wing-driven ground and air-water interfacial locomotion, improving the versatility of the robot. Forward thrust is generated by increasing the speed of downstroke relative to the upstroke of the flapping wings. This also allows for steering. The ability to land and subsequently move along the ground allows the robot to negotiate extremely confined spaces, underneath obstacles, and to precise locations. We describe the new design in detail and present results demonstrating these capabilities, as well as hovering flight and controlled landing.

Submitted 25 October, 2020; v1 submitted 7 January, 2020; originally announced January 2020.

Comments: 15 pages. Submitted to IEEE Transactions on Robotics (T-RO)

arXiv:2001.00301 [pdf]

doi 10.7236/JIIBC.2019.19.6.15

A Feature Comparison of Modern Digital Forensic Imaging Software

Authors: Jiyoon Ham, Joshua I. James

Abstract: Fundamental processes in digital forensic investigation, such as disk imaging, were developed when digital investigation was relatively young. As digital forensic processes and procedures matured, these fundamental tools, which are the pillars of the rest of the data processing and analysis phases of an investigation, largely stayed the same. This work is a study of modern digital forensic imaging software tools. Specifically, we will examine the feature sets of modern digital forensic imaging tools, as well as their development and release cycles to understand patterns of fundamental tool development. Based on this survey, we show the weakness in current digital investigation fundamental software development and maintenance over time. We also provide recommendations on how to improve fundamental tools.

Submitted 1 January, 2020; originally announced January 2020.

Comments: 6 pages, 1 figure

Journal ref: The Journal of The Institute of Internet, Broadcasting and Communication 19-6: 15-20 (2019)

arXiv:1912.10822 [pdf, other]

DeepHashing using TripletLoss

Authors: Jithin James

Abstract: Hashing is one of the most efficient techniques for approximate nearest neighbour search for large scale image retrieval. Most of the techniques are based on hand-engineered features and do not give optimal results all the time. Deep Convolutional Neural Networks have proven to generate very effective representations of images that are used for various computer vision tasks and, inspired by this, several Deep Hashing models, such as that of Wang et al. (2016), have been proposed. These models train on the triplet loss function, which can be used to train models with superior representation capabilities. Taking the latest advancements in training using the triplet loss, I propose new techniques that help Deep Hashing models train faster and more efficiently. Experimental results show that using the more efficient techniques for training on the triplet loss, we have obtained a 5% improvement in our model compared to the original work of Wang et al. (2016). Using a larger model and more training data we can drastically improve the performance using the techniques we propose.

Submitted 17 December, 2019; originally announced December 2019.

arXiv:1908.01940 [pdf, other]

Restoration of Non-rigidly Distorted Underwater Images using a Combination of Compressive Sensing and Local Polynomial Image Representations

Authors: Jerin Geo James, Pranay Agrawal, Ajit Rajwade

Abstract: Images of static scenes submerged beneath a wavy water surface exhibit severe non-rigid distortions. The physics of water flow suggests that water surfaces possess spatio-temporal smoothness and temporal periodicity. Hence they possess a sparse representation in the 3D discrete Fourier (DFT) basis. Motivated by this, we pose the task of restoration of such video sequences as a compressed sensing (CS) problem. We begin by tracking a few salient feature points across the frames of a video sequence of the submerged scene. Using these point trajectories, we show that the motion fields at all other (non-tracked) points can be effectively estimated using a typical CS solver. This by itself is a novel contribution in the field of non-rigid motion estimation. We show that this method outperforms state of the art algorithms for underwater image restoration. We further consider a simple optical flow algorithm based on local polynomial expansion of the image frames (PEOF). Surprisingly, we demonstrate that PEOF is more efficient and often outperforms all the state of the art methods in terms of numerical measures. Finally, we demonstrate that a two-stage approach consisting of the CS step followed by PEOF much more accurately preserves the image structure and improves the (visual as well as numerical) video quality as compared to just the PEOF stage.

Submitted 5 August, 2019; originally announced August 2019.

Comments: Accepted in ICCV 2019 for oral presentation

Journal ref: ICCV 2019

arXiv:1907.07535 [pdf, other]

Tactile Model O: Fabrication and testing of a 3d-printed, three-fingered tactile robot hand

Authors: Jasper W. James, Alex Church, Luke Cramphorn, Nathan F. Lepora

Abstract: Bringing tactile sensation to robotic hands will allow for more effective grasping, along with the wide range of benefits of human-like touch. Here we present a 3D-printed, three-fingered tactile robot hand comprising an OpenHand Model O customized to house a TacTip soft biomimetic tactile sensor in the distal phalanx of each finger. We expect that combining the grasping capabilities of this underactuated hand with sophisticated tactile sensing will result in an effective platform for robot hand research -- the Tactile Model O (T-MO). The design uses three JeVois machine vision systems, each comprising a miniature camera in the tactile fingertip with a processing module in the base of the hand. To evaluate the capabilities of the T-MO, we benchmark its grasping performance using the Gripper Assessment Benchmark on the YCB object set. Tactile sensing capabilities are evaluated by performing tactile object classification on 26 objects and predicting whether a grasp will successfully lift each object. Results are consistent with the state of the art, taking advantage of advances in deep learning applied to tactile image outputs. Overall, this work demonstrates that the T-MO is an effective platform for robot hand research and we expect it to open up a range of applications in autonomous object handling. Supplemental video: https://youtu.be/RTcCpgffCrQ.

Submitted 14 August, 2020; v1 submitted 17 July, 2019; originally announced July 2019.

Comments: 15 pages, 10 figures, 3 tables

arXiv:1903.03275 [pdf, other]

Below Horizon Aircraft Detection Using Deep Learning for Vision-Based Sense and Avoid

Authors: Jasmin James, Jason J. Ford, Timothy L. Molloy

Abstract: Commercial operation of unmanned aerial vehicles (UAVs) would benefit from an onboard ability to sense and avoid (SAA) potential mid-air collision threats. In this paper we present a new approach for detection of aircraft below the horizon. We address some of the challenges faced by existing vision-based SAA methods such as detecting stationary aircraft (that have no relative motion to the background), rejecting moving ground vehicles, and simultaneous detection of multiple aircraft. We propose a multi-stage, vision-based aircraft detection system which utilises deep learning to produce candidate aircraft that we track over time. We evaluate the performance of our proposed system on real flight data where we demonstrate detection ranges comparable to the state of the art with the additional capability of detecting stationary aircraft, rejecting moving ground vehicles, and tracking multiple aircraft.

Submitted 7 March, 2019; originally announced March 2019.

arXiv:1803.08205 [pdf]

doi 10.7236/JIIBC.2017.17.2.7

Update Thresholds of More Accurate Time Stamp for Event Reconstruction

Authors: Joshua I. James, Yunsik Jang

Abstract: Many systems rely on reliable timestamps to determine the time of a particular action or event. This is especially true in digital investigations where investigators are attempting to determine when a suspect actually committed an action. The challenge, however, is that objects are not updated at the exact moment that an event occurs, but within some time-span after the actual event. In this work we define a simple model of digital systems with objects that have associated timestamps. The model is used to predict object update patterns for objects with associated timestamps, and make predictions about these update time-spans. Through empirical studies of digital systems, we show that timestamp update patterns are not instantaneous. We then provide a method for calculating the distribution of timestamp updates on a particular system to determine more accurate action instance times.

Submitted 21 March, 2018; originally announced March 2018.

Comments: 13 pages, 7 figures

Journal ref: James, J. I., & Jang, Y. (2017). Update Thresholds of More Accurate Time Stamp for Event Reconstruction. The Journal of the Institute of Internet Broadcasting and Communication, 17(2), 7-13. https://doi.org/10.7236/JIIBC.2017.17.2.7

arXiv:1711.04502 [pdf]

United Nations Digital Blue Helmets as a Starting Point for Cyber Peacekeeping

Authors: Nikolay Akatyev, Joshua I. James

Abstract: Prior works, such as the Tallinn manual on the international law applicable to cyber warfare, focus on the circumstances of cyber warfare. Many organizations are considering how to conduct cyber warfare, but few have discussed methods to reduce, or even prevent, cyber conflict. A recent series of publications started developing the framework of Cyber Peacekeeping (CPK) and its legal requirements. These works assessed the current state of organizations such as ITU IMPACT, NATO CCDCOE and Shanghai Cooperation Organization, and found that they did not satisfy requirements to effectively host CPK activities. An assessment of organizations currently working in the areas related to CPK found that the United Nations (UN) has mandates and organizational structures that appear to somewhat overlap the needs of CPK. However, the UN's current approach to Peacekeeping cannot be directly mapped to cyberspace. In this research we analyze the development of traditional Peacekeeping in the United Nations, and current initiatives in cyberspace. Specifically, we will compare the proposed CPK framework with the recent initiative of the United Nations named the 'Digital Blue Helmets' as well as with other projects in the UN which help to predict and mitigate conflicts. Our goal is to find practical recommendations for the implementation of the CPK framework in the United Nations, and to examine how responsibilities defined in the CPK framework overlap with those of the 'Digital Blue Helmets' and the Global Pulse program.

Submitted 13 November, 2017; originally announced November 2017.

Journal ref: European Conference on Information Warfare and Security, ECCWS. p.8-16 (2017)

arXiv:1711.04500 [pdf]

A Case Study of the 2016 Korean Cyber Command Compromise

Authors: Kyong Jae Park, Sung Mi Park, Joshua I. James

Abstract: In October 2016 the South Korean cyber military unit was the victim of a successful cyber attack that allowed access to internal networks. Per usual with large scale attacks against South Korean entities, the hack was immediately attributed to North Korea. Also, per other large-scale cyber security incidents, the same types of 'evidence' were used for attribution purposes. Disclosed methods of attribution provide weak evidence, and the procedure Korean organizations tend to use for information disclosure leads many to question any conclusions. We will analyze and discuss a number of issues with the current way that South Korean organizations disclose cyber attack information to the public. A time line of events and disclosures will be constructed and analyzed in the context of appropriate measures for cyber warfare. Finally, we will examine the South Korean cyber military attack in terms of previously proposed cyber warfare response guidelines. Specifically, whether any of the guidelines can be applied to this real-world case, and if so, is South Korea justified in declaring war based on the most recent cyber attack.

Submitted 13 November, 2017; originally announced November 2017.

Journal ref: European Conference on Information Warfare and Security, ECCWS. p.315-321 (2017)

arXiv:1502.05191 [pdf, other]

doi 10.1007/978-3-319-14289-0_15

Determining Training Needs for Cloud Infrastructure Investigations using I-STRIDE

Authors: Joshua I. James, Ahmed F. Shosha, Pavel Gladyshev

Abstract: As more businesses and users adopt cloud computing services, security vulnerabilities will be increasingly found and exploited. There are many technological and political challenges where investigation of potentially criminal incidents in the cloud is concerned. Security experts, however, must still be able to acquire and analyze data in a methodical, rigorous and forensically sound manner. This work applies the STRIDE asset-based risk assessment method to cloud computing infrastructure for the purpose of identifying and assessing an organization's ability to respond to and investigate breaches in cloud computing environments. An extension to the STRIDE risk assessment model is proposed to help organizations quickly respond to incidents while ensuring acquisition and integrity of the largest amount of digital evidence possible. Further, the proposed model allows organizations to assess the needs and capacity of their incident responders before an incident occurs.

Submitted 18 February, 2015; originally announced February 2015.

Comments: 13 pages, 3 figures, 3 tables, 5th International Conference on Digital Forensics and Cyber Crime; Digital Forensics and Cyber Crime, pp. 223-236, 2014

arXiv:1502.05186 [pdf, ps, other]

doi 10.1007/978-3-319-14289-0_11

Measuring Accuracy of Automated Parsing and Categorization Tools and Processes in Digital Investigations

Authors: Joshua I. James, Alejandra Lopez-Fernandez, Pavel Gladyshev

Abstract: This work presents a method for the measurement of the accuracy of evidential artifact extraction and categorization tasks in digital forensic investigations. Instead of focusing on the measurement of accuracy and errors in the functions of digital forensic tools, this work proposes the application of information retrieval measurement techniques that allow the incorporation of errors introduced by tools and analysis processes. This method uses a 'gold standard' that is the collection of evidential objects determined by a digital investigator from suspect data with an unknown ground truth. This work proposes that the accuracy of tools and investigation processes can be evaluated compared to the derived gold standard using common precision and recall values. Two example case studies are presented showing the measurement of the accuracy of automated analysis tools as compared to an in-depth analysis by an expert. It is shown that such measurement can allow investigators to determine changes in accuracy of their processes over time, and determine if such a change is caused by their tools or knowledge.

Submitted 18 February, 2015; originally announced February 2015.

Comments: 17 pages, 2 appendices, 1 figure, 5th International Conference on Digital Forensics and Cyber Crime; Digital Forensics and Cyber Crime, pp. 147-169, 2014

arXiv:1502.01133 [pdf]

doi 10.7236/JIIBC.2014.14.6.33

Practical and Legal Challenges of Cloud Investigations

Authors: Joshua I. James, Yunsik Jang

Abstract: An area presenting new opportunities for both legitimate business, as well as criminal organizations, is Cloud computing. This work gives a strong background in current digital forensic science, as well as a basic understanding of the goal of Law Enforcement when conducting digital forensic investigations. These concepts are then applied to digital forensic investigation of cloud environments in both theory and practice, and supplemented with current literature on the subject. Finally, legal challenges with digital forensic investigations in cloud environments are discussed.

Submitted 4 February, 2015; originally announced February 2015.

Comments: 7 pages

ACM Class: K.4.1; K.4.2

Journal ref: The Journal of The Institute of Internet, Broadcasting and Communication, 14(6), 33-39, 2014

arXiv:1407.5714 [pdf, other]

doi 10.1007/s10207-014-0249-6

Automated Inference of Past Action Instances in Digital Investigations

Authors: Joshua I. James, Pavel Gladyshev

Abstract: As the number of digital devices suspected of containing digital evidence increases, case backlogs for digital investigations are also increasing in many organizations. To ensure timely investigation of requests, this work proposes the use of signature-based methods for automated action instance approximation to automatically reconstruct past user activities within a compromised or suspect system. This work specifically explores how multiple instances of a user action may be detected using signature-based methods during a post-mortem digital forensic analysis. A system is formally defined as a set of objects, where a subset of objects may be altered on the occurrence of an action. A novel action-trace update time threshold is proposed that enables objects to be categorized by their respective update patterns over time. By integrating time into event reconstruction, the most recent action instance approximation as well as limited past instances of the action may be differentiated and their time values approximated. After the formal theory of signature-based event reconstruction is defined, a case study is given to evaluate the practicality of the proposed method.

Submitted 21 July, 2014; originally announced July 2014.

Comments: International Journal of Information Security

arXiv:1308.6363 [pdf, ps, other]

doi 10.1007/978-3-642-40861-8_51

Measuring digital crime investigation capacity to guide international crime prevention strategies

Authors: Joshua I. James, Yunsik Jake Jang

Abstract: This work proposes a method for the measurement of a country's digital investigation capacity and saturation for the assessment of future capacity expansion. The focus is on external, or international, partners being a factor that could negatively affect the return on investment when attempting to expand investigation capacity nationally. This work concludes with the argument that when dealing with digital crime, target international partners should be a consideration in expansion, and could potentially be a bottleneck of investigation requests.

Submitted 29 August, 2013; originally announced August 2013.

Comments: 7 pages, 3 figures, Presented at FutureTech 2013

Journal ref: Future Information Technology. Springer Berlin Heidelberg, 2014. 361-366

arXiv:1307.0076 [pdf, ps, other]

An Assessment Model for Cybercrime Investigation Capacity

Authors: Joshua I. James, Yunsik Jake Jang

Abstract: Digital technologies are constantly changing, and with them criminals are finding new ways to abuse these technologies. Cybercrime investigators, then, must also keep their skills and knowledge up to date. This work proposes a holistic training development model - specifically focused on cybercrime investigation - that is based on improving investigator capability while also considering the capacity of the investigator or unit. Along with a training development model, a cybercrime investigation capacity assessment framework is given for attempting to measure capacity throughout the education process. First, a training development model is proposed that focuses on the expansion of investigation capability as well as capacity of investigators and units. Next, a capacity assessment model is given to evaluate the effectiveness of the training program. A description of how the proposed model is being applied to the development of training programs for cybercrime investigators in developing countries will then be given, as well as already observed challenges. Finally, concluding remarks as well as proposed future work are discussed.

Submitted 29 June, 2013; originally announced July 2013.

Comments: 1 figure, World Crime Forum 1st Asian Regional Conference - Information Society and Cybercrime: Challenges for Criminology and Criminal Justice

arXiv:1303.4498 [pdf]

Challenges with Automation in Digital Forensic Investigations

Authors: Joshua I. James, Pavel Gladyshev

Abstract: The use of automation in digital forensic investigations is not only a technological issue, but also has political and social implications. This work discusses some challenges with the implementation and acceptance of automation in digital forensic investigation, and possible implications for current digital forensic investigators. Current attitudes towards the use of automation in digital forensic investigations are examined, as well as the issue of digital investigators' knowledge acquisition and retention. The argument is made for a well planned, careful use of automation going forward that allows for a more efficient and effective use of automation in digital forensic investigations while at the same time attempting to improve the overall quality of expert investigators. Targeting and carefully controlling automated solutions for beginning investigators may improve the speed and quality of investigations while at the same time letting expert digital investigators spend more time utilizing expert level knowledge required in manual phases of investigations. By considering how automated solutions are being implemented into digital investigations, investigation unit managers can increase the efficiency of their unit while at the same time maximizing their return on investment for expert level digital investigator training.

Submitted 19 March, 2013; originally announced March 2013.

Comments: 17 pages, 1 figure

arXiv:1302.2395 [pdf]

doi 10.1007/978-3-642-19513-6_8

Signature Based Detection of User Events for Post-Mortem Forensic Analysis

Authors: Joshua I. James, Pavel Gladyshev, Yuandong Zhu

Abstract: This paper introduces a novel approach to user event reconstruction by showing the practicality of generating and implementing signature-based analysis methods to reconstruct high-level user actions from a collection of low-level traces found during a post-mortem forensic analysis of a system. Traditional forensic analysis and the inferences an investigator normally makes when given digital evidence are examined. It is then demonstrated that this natural process of inferring high-level events from low-level traces may be encoded using signature-matching techniques. Simple signatures using the defined method are created and applied for three popular Windows-based programs as a proof of concept.

Submitted 10 February, 2013; originally announced February 2013.

Comments: 15 pages, 4 figures, 5 tables, 1 appendix, 2nd International Conference on Digital Forensics and Cyber Crime

Journal ref: James, J.I., P. Gladyshev, Y. Zhu. (2011) "Signature Based Detection of User Events for Post-Mortem Forensic Analysis". Digital Forensics and Cyber Crime. Vol 53. pp 96-109. Springer

arXiv:1302.2308 [pdf]

doi 10.1007/978-3-642-11534-9_9

Analysis of Evidence Using Formal Event Reconstruction

Authors: Joshua I. James, Pavel Gladyshev, Mohd Taufik Abdullah, Yuandong Zhu

Abstract: This paper expands upon the finite state machine approach for the formal analysis of digital evidence. The proposed method may be used to support the feasibility of a given statement by testing it against a relevant system model. To achieve this, a novel method for modeling the system and evidential statements is given. The method is then examined in a case study example.

Submitted 10 February, 2013; originally announced February 2013.

Comments: 10 pages, 11 figures, Presented at the 1st International Conference on Digital Forensics & Cyber Crime

Journal ref: James, J.I., P. Gladyshev, M. Abdullah, Y. Zhu (2010) "Analysis of Evidence Using Formal Event Reconstruction". Digital Forensics and Cyber Crime. Vol 31. pp 85-98. Springer

Showing 1–44 of 44 results for author: James, J