AI for All: Identifying AI incidents Related to Diversity and Inclusion

Authors: Rifat Ara Shams, Didar Zowghi, Muneera Bano

Abstract: The rapid expansion of Artificial Intelligence (AI) technologies has introduced both significant advancements and challenges, with diversity and inclusion (D&I) emerging as a critical concern. Addressing D&I in AI is essential to reduce biases and discrimination, enhance fairness, and prevent adverse societal impacts. Despite its importance, D&I considerations are often overlooked, resulting in incidents marked by built-in biases and ethical dilemmas. Analyzing AI incidents through a D&I lens is crucial for identifying causes of biases and developing strategies to mitigate them, ensuring fairer and more equitable AI technologies. However, systematic investigations of D&I-related AI incidents are scarce. This study addresses these challenges by identifying and understanding D&I issues within AI systems through a manual analysis of AI incident databases (AIID and AIAAIC). The research develops a decision tree to investigate D&I issues tied to AI incidents and populate a public repository of D&I-related AI incidents. The decision tree was validated through a card sorting exercise and focus group discussions. The research demonstrates that almost half of the analyzed AI incidents are related to D&I, with a notable predominance of racial, gender, and age discrimination. The decision tree and resulting public repository aim to foster further research and responsible AI practices, promoting the development of inclusive and equitable AI systems.

Submitted 19 July, 2024; originally announced August 2024.

Comments: 25 pages, 9 figures, 2 tables

ACM Class: I.2.0

arXiv:2311.14695 [pdf]

AI for All: Operationalising Diversity and Inclusion Requirements for AI Systems

Authors: Muneera Bano, Didar Zowghi, Vincenzo Gervasi, Rifat Shams

Abstract: As Artificial Intelligence (AI) permeates many aspects of society, it brings numerous advantages while at the same time raising ethical concerns and potential risks, such as perpetuating inequalities through biased or discriminatory decision-making. To develop AI systems that cater for the needs of diverse users and uphold ethical values, it is essential to consider and integrate diversity and inclusion (D&I) principles throughout AI development and deployment. Requirements engineering (RE) is a fundamental process in developing software systems by eliciting and specifying relevant needs from diverse stakeholders. This research aims to address the lack of research and practice on how to elicit and capture D&I requirements for AI systems. We have conducted comprehensive data collection and synthesis from the literature review to extract requirements themes related to D&I in AI. We have proposed a tailored user story template to capture D&I requirements and conducted focus group exercises to use the themes and user story template in writing D&I requirements for two example AI systems. Additionally, we have investigated the capability of our solution by generating synthetic D&I requirements captured in user stories with the help of a Large Language Model.

Submitted 7 November, 2023; originally announced November 2023.

Comments: 10 pages, 5 figures

arXiv:2307.10600 [pdf, other]

Challenges and Solutions in AI for All

Authors: Rifat Ara Shams, Didar Zowghi, Muneera Bano

Abstract: Artificial Intelligence (AI)'s pervasive presence and variety necessitate diversity and inclusivity (D&I) principles in its design for fairness, trust, and transparency. Yet, these considerations are often overlooked, leading to issues of bias, discrimination, and perceived untrustworthiness. In response, we conducted a Systematic Review to unearth challenges and solutions relating to D&I in AI. Our rigorous search yielded 48 research articles published between 2017 and 2022. Open coding of these papers revealed 55 unique challenges and 33 solutions for D&I in AI, as well as 24 unique challenges and 23 solutions for enhancing such practices using AI. This study, by offering a deeper understanding of these issues, will enlighten researchers and practitioners seeking to integrate these principles into future AI systems.

Submitted 20 July, 2023; originally announced July 2023.

Comments: 39 pages, 10 figures, 10 tables

MSC Class: I; I.2

arXiv:2203.10382 [pdf, other]

Investigating End-Users' Values in Agriculture Mobile Applications Development: An Empirical Study on Bangladeshi Female Farmers

Authors: Rifat Ara Shams, Mojtaba Shahin, Gillian Oliver, Harsha Perera, Jon Whittle, Arif Nurwidyantoro, Waqar Hussain

Abstract: The omnipresent nature of mobile applications (apps) in all aspects of daily lives raises the necessity of reflecting end-users' values (e.g., fairness, honesty, etc.) in apps. However, there are limited considerations of end-users' values in app development. Value violations by apps have been reported in the media and are responsible for end-users' dissatisfaction and negative socio-economic consequences. Value violations may bring more severe and lasting problems for marginalized and vulnerable end-users of apps, which have been explored less (if at all) in the software engineering community. However, understanding the values of the end-users of apps is the essential first step towards values-based app development. This research aims to fill this gap by investigating the human values of Bangladeshi female farmers as a marginalized and vulnerable group of end-users of Bangladeshi agriculture apps. We conducted an empirical study that collected and analyzed data from a survey with 193 Bangladeshi female farmers to explore the underlying factor structure of the values of Bangladeshi female farmers and the significance of demographics on their values. The results identified three underlying factors of Bangladeshi female farmers. The first factor comprises five values: benevolence, security, conformity, universalism, and tradition. The second factor consists of two values: self-direction and stimulation. The third factor includes three values: power, achievement, and hedonism. We also identified strong influences of demographics on some of the values of Bangladeshi female farmers. For example, area has significant impacts on three values: hedonism, achievement, and tradition. Similarly, there are also strong influences of household income on power and security.

Submitted 19 March, 2022; originally announced March 2022.

Comments: 44 pages, 7 figures, 8 tables, Journal of Systems and Software

arXiv:2111.15293 [pdf, other]

The Impact of Considering Human Values during Requirements Engineering Activities

Authors: Harsha Perera, Rashina Hoda, Rifat Ara Shams, Arif Nurwidyantoro, Mojtaba Shahin, Waqar Hussain, Jon Whittle

Abstract: Human values, or what people hold important in their life, such as freedom, fairness, and social responsibility, often remain unnoticed and unattended during software development. Ignoring values can lead to values violations in software that can result in financial losses, reputation damage, and widespread social and legal implications. However, embedding human values in software is not only non-trivial but also generally an unclear process. Commencing as early as during the Requirements Engineering (RE) activities promises to ensure fit-for-purpose and quality software products that adhere to human values. But what is the impact of considering human values explicitly during early RE activities? To answer this question, we conducted a scenario-based survey where 56 software practitioners contextualised requirements analysis towards a proposed mobile application for the homeless and suggested values-laden software features accordingly. The suggested features were qualitatively analysed. Results show that explicit considerations of values can help practitioners identify applicable values, associate purpose with the features they develop, think outside-the-box, and build connections between software features and human values. Finally, drawing from the results and experiences of this study, we propose a scenario-based values elicitation process -- a simple four-step takeaway as a practical implication of this study.

Submitted 30 November, 2021; originally announced November 2021.

Comments: 17 pages, 8 images, 5 tables

arXiv:2110.05150 [pdf, other]

Human Values in Mobile App Development: An Empirical Study on Bangladeshi Agriculture Mobile Apps

Authors: Rifat Ara Shams, Mojtaba Shahin, Gillian Oliver, Jon Whittle, Waqar Hussain, Harsha Perera, Arif Nurwidyantoro

Abstract: Given the ubiquity of mobile applications (apps) in daily lives, understanding and reflecting end-users' human values (e.g., transparency, privacy, social recognition etc.) in apps has become increasingly important. Violations of end users' values by software applications have been reported in the media and have resulted in a wide range of difficulties for end users. Value violations may bring more and lasting problems for marginalized and vulnerable groups of end-users. This research aims to understand the extent to which the values of Bangladeshi female farmers, marginalized and vulnerable end-users, who are less studied by the software engineering community, are reflected in agriculture apps in Bangladesh. Further to this, we aim to identify possible strategies to embed their values in those apps. To this end, we conducted a mixed-methods empirical study consisting of 13 interviews with app practitioners and four focus groups with 20 Bangladeshi female farmers. The accumulated results from the interviews and focus groups identified 22 values of Bangladeshi female farmers, which the participants expect to be reflected in the agriculture apps. Among these 22 values, 15 values (e.g., accuracy, independence) are already reflected and 7 values (e.g., accessibility, pleasure) are ignored/violated in the existing agriculture apps. We also identified 14 strategies (e.g., "applying human-centered approaches to elicit values", "establishing a dedicated team/person for values concerns") to address Bangladeshi female farmers' values in agriculture apps.

Submitted 11 October, 2021; originally announced October 2021.

Comments: 18 pages, 6 figures, Manuscript submitted to IEEE Transactions on Software Engineering (2021)

arXiv:2108.05624 [pdf, other]

doi 10.1109/ACCESS.2022.3190975

Operationalizing Human Values in Software Engineering: A Survey

Authors: Mojtaba Shahin, Waqar Hussain, Arif Nurwidyantoro, Harsha Perera, Rifat Shams, John Grundy, Jon Whittle

Abstract: Human values (e.g., pleasure, privacy, and social justice) are what a person or a society considers important. The inability to address them in software-intensive systems can result in numerous undesired consequences (e.g., financial losses) for individuals and communities. Various solutions (e.g., methodologies, techniques) are developed to help "operationalize values in software". The ultimate goal is to ensure building software (better) reflects and respects human values. In this survey, "operationalizing values" is referred to as the process of identifying human values and translating them to accessible and concrete concepts so that they can be implemented, validated, verified, and measured in software. This paper provides a deep understanding of the research landscape on operationalizing values in software engineering, covering 51 primary studies. It also presents an analysis and taxonomy of 51 solutions for operationalizing values in software engineering. Our survey reveals that most solutions attempt to help operationalize values in the early phases (requirements and design) of the software development life cycle. However, the later phases (implementation and testing) and other aspects of software development (e.g., "team organization") still need adequate consideration. We outline implications for research and practice and identify open issues and future research directions to advance this area.

Submitted 25 July, 2022; v1 submitted 12 August, 2021; originally announced August 2021.

Comments: Accepted for publication in IEEE Access Journal, IEEE - 27 Pages - 14 Tables, 7 Figures

arXiv:2107.11273 [pdf, other]

Towards a Human Values Dashboard for Software Development: An Exploratory Study

Authors: Arif Nurwidyantoro, Mojtaba Shahin, Michel Chaudron, Waqar Hussain, Harsha Perera, Rifat Ara Shams, Jon Whittle

Abstract: Background: There is a growing awareness of the importance of human values (e.g., inclusiveness, privacy) in software systems. However, there are no practical tools to support the integration of human values during software development. We argue that a tool that can identify human values from software development artefacts and present them to varying software development roles can (partially) address this gap. We refer to such a tool as a human values dashboard. Further to this, our understanding of such a tool is limited. Aims: This study aims to (1) investigate the possibility of using a human values dashboard to help address human values during software development, (2) identify possible benefits of using a human values dashboard, and (3) elicit practitioners' needs from a human values dashboard. Method: We conducted an exploratory study by interviewing 15 software practitioners. A dashboard prototype was developed to support the interview process. We applied thematic analysis to analyse the collected data. Results: Our study finds that a human values dashboard would be useful for the development team (e.g., project manager, developer, tester). Our participants acknowledge that development artefacts, especially requirements documents and issue discussions, are the most suitable source for identifying values for the dashboard. Our study also yields a set of high-level user requirements for a human values dashboard (e.g., it shall allow determining values priority of a project). Conclusions: Our study suggests that a values dashboard could be used to raise awareness of values and support values-based decision-making in software development. Future work will focus on addressing the requirements and using issue discussions as potential artefacts for the dashboard.

Submitted 23 July, 2021; originally announced July 2021.

Comments: 12 Pages. Accepted to appear in 15th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM). Preprint

arXiv:2102.12107 [pdf, other]

How Can Human Values Be Addressed in Agile Methods? A Case Study on SAFe

Authors: Waqar Hussain, Mojtaba Shahin, Rashina Hoda, Jon Whittle, Harsha Perera, Arif Nurwidyantoro, Rifat Ara Shams, Gillian Oliver

Abstract: Agile methods are predominantly focused on delivering business values. But can Agile methods be adapted to effectively address and deliver human values such as social justice, privacy, and sustainability in the software they produce? Human values are what an individual or a society considers important in life. Ignoring these human values in software can pose difficulties or risks for all stakeholders (e.g., user dissatisfaction, reputation damage, financial loss). To answer this question, we selected the Scaled Agile Framework (SAFe), one of the most commonly used Agile methods in the industry, and conducted a qualitative case study to identify possible intervention points within SAFe that are the most natural to address and integrate human values in software. We present five high-level empirically-justified sets of interventions in SAFe: artefacts, roles, ceremonies, practices, and culture. We elaborate how some current Agile artefacts (e.g., user story), roles (e.g., product owner), ceremonies (e.g., stand-up meeting), and practices (e.g., business-facing testing) in SAFe can be modified to support the inclusion of human values in software. Further, our study suggests new and exclusive values-based artefacts (e.g., legislative requirement), ceremonies (e.g., values conversation), roles (e.g., values champion), and cultural practices (e.g., induction and hiring) to be introduced in SAFe for this purpose. Guided by our findings, we argue that existing Agile methods can account for human values in software delivery with some evolutionary adaptations.

Submitted 12 November, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

Comments: Preprint - Accepted to be published in IEEE Transactions on Software Engineering (2021), 18 Pages, 5 Figures, 3 Tables

arXiv:2012.01268 [pdf]

Measuring Bangladeshi Female Farmers' Values for Agriculture Mobile Applications Development

Authors: Rifat Ara Shams, Mojtaba Shahin, Gillian Oliver, Waqar Hussain, Harsha Perera, Arif Nurwidyantoro, Jon Whittle

Abstract: The ubiquity of mobile applications (apps) in daily life raises the imperative that the apps should reflect users' values. However, users' values are not usually taken into account in app development. Thus there is significant potential for user dissatisfaction and negative socio-economic consequences. To be cognizant of values in apps, the first step is to find out what those values are, and that was the objective of this study conducted in Bangladesh. Our focus was on rural women, specifically female farmers. The basis for our study was Schwartz's universal human values theory, and we used an associated survey instrument, the Portrait Values Questionnaire (PVQ). Our survey of 193 Bangladeshi female farmers showed that Conformity and Security were regarded as the most important values, while Power, Hedonism, and Stimulation were the least important. This finding would be helpful for developers to take into account when developing agriculture apps for this market. In addition, the methodology we used provides a model to follow to elicit the values of apps' users in other communities.

Submitted 22 November, 2020; originally announced December 2020.

Comments: 10 Pages, Accepted to appear in 54th Hawaii International Conference on System Sciences, 2021

arXiv:2005.03759 [pdf, other]

doi 10.1016/j.advwatres.2020.103787

DeePore: a deep learning workflow for rapid and comprehensive characterization of porous materials

Authors: Arash Rabbani, Masoud Babaei, Reza Shams, Ying Da Wang, Traiwit Chung

Abstract: DeePore is a deep learning workflow for rapid estimation of a wide range of porous material properties based on the binarized micro-tomography images. By combining naturally occurring porous textures we generated 17700 semi-real 3-D micro-structures of porous geo-materials with size of 256^3 voxels and 30 physical properties of each sample are calculated using physical simulations on the corresponding pore network models. Next, a designed feed-forward convolutional neural network (CNN) is trained based on the dataset to estimate several morphological, hydraulic, electrical, and mechanical characteristics of the porous material in a fraction of a second. In order to fine-tune the CNN design, we tested 9 different training scenarios and selected the one with the highest average coefficient of determination (R^2) equal to 0.885 for 1418 testing samples. Additionally, 3 independent synthetic images as well as 3 realistic tomography images have been tested using the proposed method and results are compared with pore network modelling and experimental data, respectively. Tested absolute permeabilities had around 13% relative error compared to the experimental data which is noticeable considering the accuracy of the direct numerical simulation methods such as Lattice Boltzmann and Finite Volume. The workflow is compatible with any physical size of the images due to its dimensionless approach and can be used to characterize large-scale 3-D images by averaging the model outputs for a sliding window that scans the whole geometry.
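As an illustration of the metric named in the abstract above, the following is a minimal sketch of the standard coefficient of determination, R^2 = 1 - SS_res / SS_tot, and of a plain mean across several predicted properties. The function names `r_squared` and `mean_r_squared` and the dictionary layout are hypothetical; the abstract does not say how the authors averaged the scores, so the unweighted mean here is an assumption.

```python
def r_squared(y_true, y_pred):
    """Standard coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


def mean_r_squared(true_by_property, pred_by_property):
    """Unweighted mean of R^2 over properties (an assumed averaging scheme)."""
    scores = [
        r_squared(true_by_property[name], pred_by_property[name])
        for name in true_by_property
    ]
    return sum(scores) / len(scores)
```

A perfect prediction scores 1.0, and always predicting the mean of the true values scores 0.0.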

Submitted 10 October, 2020; v1 submitted 3 May, 2020; originally announced May 2020.

Journal ref: Advances in Water Resources, 2020, 103787

arXiv:1907.07874 [pdf, other]

A Study on the Prevalence of Human Values in Software Engineering Publications, 2015-2018

Authors: Harsha Perera, Arif Nurwidyantoro, Waqar Hussain, Davoud Mougouei, Jon Whittle, Rifat Ara Shams, Gillian Oliver

Abstract: Failure to account for human values in software (e.g., equality and fairness) can result in user dissatisfaction and negative socio-economic impact. Engineering these values in software, however, requires technical and methodological support throughout the development life cycle. This paper investigates to what extent software engineering (SE) research has considered human values. We investigate the prevalence of human values in recent (2015 - 2018) publications at some of the top-tier SE conferences and journals. We classify SE publications, based on their relevance to different values, against a widely used value structure adopted from social sciences. Our results show that: (a) only a small proportion of the publications directly consider values, classified as relevant publications; (b) for the majority of the values, very few or no relevant publications were found; and (c) the prevalence of the relevant publications was higher in SE conferences compared to SE journals. This paper shares these and other insights that motivate research on human values in software engineering.

Submitted 18 July, 2019; originally announced July 2019.

arXiv:1904.10535 [pdf]

doi 10.1109/TMI.2019.2935060

Evaluation of MRI to ultrasound registration methods for brain shift correction: The CuRIOUS2018 Challenge

Authors: Yiming Xiao, Hassan Rivaz, Matthieu Chabanas, Maryse Fortin, Ines Machado, Yangming Ou, Mattias P. Heinrich, Julia A. Schnabel, Xia Zhong, Andreas Maier, Wolfgang Wein, Roozbeh Shams, Samuel Kadoury, David Drobny, Marc Modat, Ingerid Reinertsen

Abstract: In brain tumor surgery, the quality and safety of the procedure can be impacted by intra-operative tissue deformation, called brain shift. Brain shift can move the surgical targets and other vital structures such as blood vessels, thus invalidating the pre-surgical plan. Intra-operative ultrasound (iUS) is a convenient and cost-effective imaging tool to track brain shift and tumor resection. Accurate image registration techniques that update pre-surgical MRI based on iUS are crucial but challenging. The MICCAI Challenge 2018 for Correction of Brain shift with Intra-Operative UltraSound (CuRIOUS2018) provided a public platform to benchmark MRI-iUS registration algorithms on newly released clinical datasets. In this work, we present the data, setup, evaluation, and results of CuRIOUS 2018, which received 6 fully automated algorithms from leading academic and industrial research groups. All algorithms were first trained with the public RESECT database, and then ranked based on a test dataset of 10 additional cases with identical data curation and annotation protocols as the RESECT database. The article compares the results of all participating teams and discusses the insights gained from the challenge, as well as future work.

Submitted 23 April, 2019; originally announced April 2019.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Journal ref: IEEE Transactions on Medical Imaging, 2019

arXiv:1409.7612 [pdf, ps, other]

Semi-supervised Classification for Natural Language Processing

Authors: Rushdi Shams

Abstract: Semi-supervised classification is an interesting idea where classification models are learned from both labeled and unlabeled data. It has several advantages over supervised classification in the natural language processing domain. For instance, supervised classification exploits only labeled data that are expensive, often difficult to get, inadequate in quantity, and require human experts for annotation. On the other hand, unlabeled data are inexpensive and abundant. Despite the fact that many factors limit the wide-spread use of semi-supervised classification, it has become popular since its level of performance is empirically as good as supervised classification. This study explores the possibilities and achievements as well as complexity and limitations of semi-supervised classification for several natural language processing tasks like parsing, biomedical information processing, text classification, and summarization.

Submitted 25 September, 2014; originally announced September 2014.

arXiv:1409.7386 [pdf]

Performance of Stanford and Minipar Parser on Biomedical Texts

Authors: Rushdi Shams

Abstract: In this paper, the performance of two dependency parsers, namely Stanford and Minipar, on biomedical texts has been reported. The performance of the parsers in assigning dependencies between two biomedical concepts that are already proved to be connected is not satisfying. Both Stanford and Minipar, being statistical parsers, fail to assign a dependency relation between two connected concepts if they are distant by at least one clause. Minipar's performance, in terms of precision, recall and the F-score of the attachment score (e.g., correctly identified head in a dependency), to parse biomedical text is also measured taking Stanford's as a gold standard. The results suggest that Minipar is not suitable yet to parse biomedical texts. In addition, a qualitative investigation reveals that the difference between the working principles of the parsers also plays a vital role in Minipar's degraded performance.
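The evaluation described above (precision, recall and F-score of head attachments against a gold-standard parse) can be sketched as follows. This is an assumed formulation, not the paper's own code: each parse is represented as a set of hypothetical `(dependent, head)` token-index pairs, and `attachment_prf` is an illustrative name.

```python
def attachment_prf(predicted, gold):
    """Precision, recall and F-score of predicted (dependent, head) pairs
    against a gold-standard set of pairs."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # attachments with the right head
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical example: token index 0 stands for the root.
gold_parse = {(1, 2), (2, 0), (3, 2)}
other_parse = {(1, 2), (2, 0), (3, 1), (4, 2)}
precision, recall, f_score = attachment_prf(other_parse, gold_parse)
```

Treating one parser's output as the gold standard, as the abstract does, means the scores measure agreement with that parser and not absolute correctness.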

Submitted 25 September, 2014; originally announced September 2014.

arXiv:1307.8060 [pdf, ps, other]

Extracting Information-rich Part of Texts using Text Denoising

Authors: Rushdi Shams

Abstract: The aim of this paper is to report on a novel text reduction technique, called Text Denoising, that highlights information-rich content when processing a large volume of text data, especially from the biomedical domain. The core feature of the technique, the text readability index, embodies the hypothesis that complex text is more information-rich than the rest. When applied to tasks like biomedical relation-bearing text extraction, keyphrase indexing and extracting sentences describing protein interactions, it is evident that the reduced set of text produced by text denoising is more information-rich than the rest.

Submitted 30 July, 2013; originally announced July 2013.

Comments: 26th Canadian Conference on Artificial Intelligence (CAI-2013), Regina, Canada, May 29-31, 2013

arXiv:1307.8057 [pdf]

Extracting Connected Concepts from Biomedical Texts using Fog Index

Authors: Rushdi Shams, Robert E. Mercer

Abstract: In this paper, we establish Fog Index (FI) as a text filter to locate the sentences in texts that contain connected biomedical concepts of interest. To do so, we have used 24 random papers each containing four pairs of connected concepts. For each pair, we categorize sentences based on whether they contain both, either or none of the concepts. We then use FI to measure the difficulty of the sentences of each category and find that sentences containing both of the concepts have low readability. We rank the sentences of a text according to their FI and select the 30 percent most difficult sentences. We use an association matrix to track the most frequent pairs of concepts in them. This matrix reports that the first filter produces some pairs that hold almost no connections. To remove these unwanted pairs, we use the Equally Weighted Harmonic Mean of their Positive Predictive Value (PPV) and Sensitivity as a second filter. Experimental results demonstrate the effectiveness of our method.

Submitted 30 July, 2013; originally announced July 2013.

Comments: 12th Conference of the Pacific Association for Computational Linguistics (PACLING 2011), Kuala Lumpur, Malaysia, July 19-21, 2011

arXiv:1304.2476 [pdf]

doi 10.1109/ICCCE.2010.5556854

Corpus-based Web Document Summarization using Statistical and Linguistic Approach

Authors: Rushdi Shams, M. M. A. Hashem, Afrina Hossain, Suraiya Rumana Akter, Monika Gope

Abstract: Single document summarization generates a summary by extracting the representative sentences from the document. In this paper, we present a novel technique for summarization of domain-specific text from a single web document that uses statistical and linguistic analysis on the text in a reference corpus and the web document. The proposed summarizer uses a combinational function of Sentence Weight (SW) and Subject Weight (SuW) to determine the rank of a sentence, where SW is a function of the number of terms (t_n) and number of words (w_n) in a sentence and the term frequency (t_f) in the corpus, and SuW is a function of t_n and w_n in a subject and t_f in the corpus. 30 percent of the ranked sentences are considered to be the summary of the web document. We generated three web document summaries using our technique and compared each of them with summaries developed manually by 16 different human subjects. Results showed that 68 percent of the summaries produced by our approach satisfy the manual summaries.

Submitted 9 April, 2013; originally announced April 2013.

Journal ref: Procs. of the IEEE International Conference on Computer and Communication Engineering (ICCCE10), pp. 115-120, Kuala Lumpur, Malaysia, May 11-13, (2010)

arXiv:1304.2475 [pdf]

doi 10.1109/ICCCE.2010.5556841

Design and Development of a Heart Rate Measuring Device using Fingertip

Authors: M. M. A. Hashem, Rushdi Shams, Md. Abdul Kader, Md. Abu Sayed

Abstract: In this paper, we present the design and development of a new integrated device for measuring heart rate using the fingertip to improve heart rate estimation. As heart related diseases are increasing day by day, an accurate and affordable heart rate measuring device or heart monitor is essential to ensure quality of health. However, most heart rate measuring tools and environments are expensive and do not follow ergonomics. Our proposed Heart Rate Measuring (HRM) device is economical and user friendly and uses optical technology to detect the flow of blood through the index finger. Three phases are used to detect pulses on the fingertip: pulse detection, signal extraction, and pulse amplification. Qualitative and quantitative performance evaluation of the device on real signals shows accuracy in heart rate estimation, even under intense physical activity. We compared the performance of the HRM device with Electrocardiogram reports and manual pulse measurement of the heartbeat of 90 human subjects of different ages. The results showed that the error rate of the device is negligible.

Submitted 9 April, 2013; originally announced April 2013.

Journal ref: Procs. of the IEEE International Conference on Computer and Communication Engineering (ICCCE10), pp. 197-201, Kuala Lumpur, Malaysia, May 11-13, (2010)

arXiv:1207.3884 [pdf]

doi 10.5121/ijcseit.2012.2103

Effect of Interleaved FEC Code on Wavelet Based MC-CDMA System with Alamouti STBC in Different Modulation Schemes

Authors: Rifat Ara Shams, M. Hasnat Kabir, Sheikh Enayet Ullah

Abstract: In this paper, the impact of a Forward Error Correction (FEC) code, namely Trellis code with interleaver, on the performance of a wavelet based MC-CDMA wireless communication system with the implementation of the Alamouti antenna diversity scheme has been investigated in terms of Bit Error Rate (BER) as a function of Signal-to-Noise Ratio (SNR) per bit. Simulation of the system under study has been done in M-ary modulation schemes (MPSK, MQAM and DPSK) over AWGN and Rayleigh fading channels, incorporating Walsh Hadamard code as the orthogonal spreading code to discriminate the message signal for individual users. It is observed via computer simulation that the proposed interleaved coded system outperforms the uncoded system in all modulation schemes over the Rayleigh fading channel.

Submitted 17 July, 2012; originally announced July 2012.

Journal ref: International Journal of Computer Science, Engineering and Information Technology (IJCSEIT), Vol.2, No.1, February 2012, 23-33

arXiv:1207.3875 [pdf]

Transmission of Voice Signal: BER Performance Analysis of Different FEC Schemes Based OFDM System over Various Channels

Authors: Md. Golam Rashed, M. Hasnat Kabir, Md. Selim Reza, Md. Matiqul Islam, Rifat Ara Shams, Saleh Masum, Sheikh Enayet Ullah

Abstract: In this paper, we investigate the impact of Forward Error Correction (FEC) codes, namely Cyclic Redundancy Code and Convolution Code, on the performance of an OFDM wireless communication system for speech signal transmission over both AWGN and fading (Rayleigh and Rician) channels in terms of Bit Error Probability. The simulation has been done in conjunction with QPSK digital modulation and compared with uncoded results. In the fading channels, it is found via computer simulation that the Convolution interleaved OFDM system outperforms the CRC interleaved OFDM system as well as uncoded OFDM channels.

Submitted 17 July, 2012; originally announced July 2012.

Comments: 12 Pages

Journal ref: International Journal of Advanced Science and Technology, Vol. 34, September 2011, 89-100

arXiv:1207.3868 [pdf]

doi 10.5121/ijmnct.2012.2301

Impact of Different Spreading Codes Using FEC on DWT Based MC-CDMA System

Authors: Saleh Masum, M. Hasnat Kabir, Md. Matiqul Islam, Rifat Ara Shams, Shaikh Enayet Ullah

Abstract: The effect of different spreading codes in a DWT based MC-CDMA wireless communication system is investigated. In this paper, we present the Bit Error Rate (BER) performance of different spreading codes (Walsh-Hadamard code, orthogonal Gold code and Golay complementary sequences) using Forward Error Correction (FEC) in the proposed system. The data is analyzed and compared among the different spreading codes in both coded and uncoded cases. It is found via computer simulation that the performance of the proposed coded system is much better than that of the uncoded system irrespective of the spreading code, and that all the spreading codes behave approximately similarly in both the coded and uncoded cases in all modulation schemes.

Submitted 16 July, 2012; originally announced July 2012.

Comments: 10 Pages; International Journal of Mobile Network Communications & Telematics (IJMNCT) Vol.2, No.3, June 2012

arXiv:1204.6364 [pdf]

A Corpus-based Evaluation of a Domain-specific Text to Knowledge Mapping Prototype

Authors: Rushdi Shams, Adel Elsayed, Quazi Mah-Zereen Akter

Abstract: The aim of this paper is to evaluate a Text to Knowledge Mapping (TKM) Prototype. The prototype is domain-specific, the purpose of which is to map instructional text onto a knowledge domain. The context of the knowledge domain is DC electrical circuits. During development, the prototype has been tested with a limited data set from the domain. The prototype has reached a stage where it needs to be evaluated with a representative linguistic data set called a corpus. A corpus is a collection of text drawn from typical sources which can be used as a test data set to evaluate NLP systems. As there is no available corpus for the domain, we developed and annotated a representative corpus. The evaluation of the prototype considers two of its major components: the lexical components and the knowledge model. Evaluation of the lexical components enriches the lexical resources of the prototype like vocabulary and grammar structures. This leads the prototype to parse a reasonable number of sentences in the corpus. While dealing with the lexicon was straightforward, the identification and extraction of appropriate semantic relations was much more involved. It was necessary, therefore, to manually develop a conceptual structure for the domain to formulate a domain-specific framework of semantic relations. The framework of semantic relations that resulted from this study consists of 55 relations, out of which 42 have inverse relations. We also conducted rhetorical analysis on the corpus to prove its representativeness in conveying semantics. Finally, we conducted a topical and discourse analysis on the corpus to analyze the coverage of discourse by the prototype.

Submitted 27 April, 2012; originally announced April 2012.

Comments: Journal of Computers, Academy Publishers 2010

arXiv:1204.6362 [pdf]

A Corpus-based Evaluation of Lexical Components of a Domain-specific Text to Knowledge Mapping Prototype

Authors: Rushdi Shams, Adel Elsayed

Abstract: The aim of this paper is to evaluate the lexical components of a Text to Knowledge Mapping (TKM) prototype. The prototype is domain-specific, the purpose of which is to map instructional text onto a knowledge domain. The context of the knowledge domain of the prototype is physics, specifically DC electrical circuits. During development, the prototype has been tested with a limited data set from the domain. The prototype has now reached a stage where it needs to be evaluated with a representative linguistic data set called a corpus. A corpus is a collection of text drawn from typical sources which can be used as a test data set to evaluate NLP systems. As there is no available corpus for the domain, we developed a representative corpus and annotated it with linguistic information. The evaluation of the prototype considers one of its two main components, the lexical knowledge base. With the corpus, the evaluation enriches the lexical knowledge resources like vocabulary and grammar structure. This leads the prototype to parse a reasonable number of sentences in the corpus.

Submitted 27 April, 2012; originally announced April 2012.

Comments: 2008 IEEE International Conference on Computer and Information Technology (ICCIT 2008)

arXiv:1204.2245 [pdf]

Development of a Conceptual Structure for a Domain-Specific Corpus

Authors: Rushdi Shams, Adel Elsayed

Abstract: The corpus reported in this paper was developed for the evaluation of a domain-specific Text to Knowledge Mapping (TKM) prototype. The TKM prototype operates on the basis of both a combinatory categorial grammar (CCG) linguistic model and a knowledge model that consists of three layers: ontology, qualitative and quantitative layers. In the course of this evaluation it was necessary to populate these initial models with lexical items and semantic relations. Both elements, the lexicon and semantic relations, are meant to reflect the domain of the prototype; hence both had to be extracted from the corpus. While dealing with the lexicon was straightforward, the identification and extraction of appropriate semantic relations was much more involved. It was necessary, therefore, to manually develop a conceptual structure for the domain which was then used to formulate a domain-specific framework of semantic relations. The conceptual structure was developed using the Cmap tool of IHMC. The framework of semantic relations that resulted from this study consists of 55 relations, out of which 42 have inverse relations.

Submitted 10 April, 2012; originally announced April 2012.

Comments: 3rd International Conference on Concept Maps (CMC2008)

arXiv:1204.2242 [pdf]

Performance Enhancement of Ad Hoc Networks with Janitor Based Routing

Authors: Isnain Siddique, Rushdi Shams, M. M. A. Hashem

Abstract: We propose and analyze a new on-the-fly strategy, called Janitor Based Routing (JBR), that discovers, repairs and maintains routes in a hierarchical and distributed fashion. The main motivation behind our JBR protocol is to decrease flooding and routing overhead and increase efficiency in packet movement. An analytical model for the proposed JBR is presented and detailed simulation is used to observe the performance of JBR. This route discovery and maintenance protocol clearly reduces flooding and routing overhead and, hence, provides enhanced reliability.

Submitted 10 April, 2012; originally announced April 2012.

Journal ref: 1st IEEE International Conference on Computer and Communication Engineering (ICCCE2006)

arXiv:1204.2231 [pdf, other]

Investigating Keyphrase Indexing with Text Denoising

Authors: Rushdi Shams, Robert E. Mercer

Abstract: In this paper, we report on indexing performance by a state-of-the-art keyphrase indexer, Maui, when paired with a text extraction procedure called text denoising. Text denoising is a method that extracts the denoised text, comprising the content-rich sentences, from full texts. The performance of the keyphrase indexer is demonstrated on three standard corpora collected from three domains, namely food and agriculture, high energy physics, and biomedical science. Maui is trained using the full texts and denoised texts. The indexer, using its trained models, then extracts keyphrases from test sets comprising full texts, and their denoised and noise parts (i.e., the part of texts that remains after denoising). Experimental findings show that against a gold standard, the denoised-text-trained indexer indexing full texts performs either better than or as good as its benchmark performance produced by a full-text-trained indexer indexing full texts.

Submitted 10 April, 2012; originally announced April 2012.

Comments: The full paper submitted to 12th ACM/ IEEE-CS Joint Conference on Digital Libraries (JCDL2012)

ACM Class: H.3.1; H.3.3; H.3.4

arXiv:0711.4406 [pdf, ps, other]

doi 10.1109/TIT.2008.2009581

Optimization of Information Rate Upper and Lower Bounds for Channels with Memory

Authors: Parastoo Sadeghi, Pascal O. Vontobel, Ramtin Shams

Abstract: We consider the problem of minimizing upper bounds and maximizing lower bounds on information rates of stationary and ergodic discrete-time channels with memory. The channels we consider can have a finite number of states, such as partial response channels, or they can have an infinite state-space, such as time-varying fading channels. We optimize recently-proposed information rate bounds for such channels, which make use of auxiliary finite-state machine channels (FSMCs). Our main contribution in this paper is to provide iterative expectation-maximization (EM) type algorithms to optimize the parameters of the auxiliary FSMC to tighten these bounds. We provide an explicit, iterative algorithm that improves the upper bound at each iteration. We also provide an effective method for iteratively optimizing the lower bound. To demonstrate the effectiveness of our algorithms, we provide several examples of partial response and fading channels, where the proposed optimization techniques significantly tighten the initial upper and lower bounds. Finally, we compare our results with an improved variation of the simplex local optimization algorithm, called Soblex. This comparison shows that our proposed algorithms are superior to the Soblex method, both in terms of robustness in finding the tightest bounds and in computational efficiency. Interestingly, from a channel coding/decoding perspective, optimizing the lower bound is related to increasing the achievable mismatched information rate, i.e., the information rate of a communication system where the decoder at the receiver is matched to the auxiliary channel, and not to the original channel.

Submitted 27 November, 2007; originally announced November 2007.

Comments: Submitted to IEEE Transactions on Information Theory, November 24, 2007

Showing 1–28 of 28 results for author: Shams, R