10 Current Database Research Topic Ideas in 2024

As we head into the second half of 2024, the world of technology continues to evolve at a rapid pace. With the rise of AI and blockchain, the demand for data, for sound data management, and for security is growing rapidly. A natural consequence of these changes is that fields such as database security and DBMS research have become more important than ever.

With new technologies and techniques emerging day by day, staying up to date with the latest trends in database research is crucial. Whether you are a student, researcher, or industry professional, we recommend taking our Database Certification courses to stay current with the latest research topics in DBMS.

In this blog post, we will introduce you to 10 current database research topic ideas that are likely to be at the forefront of the field in 2024. From blockchain-based database systems to real-time data processing with in-memory databases, these topics offer a glimpse into the exciting future of database research.

So, get ready to dive into the exciting world of databases and discover the latest developments in database research topics of 2024!

Blurring the Lines between Blockchains and Database Systems 

The intersection of blockchain technology and database systems offers fertile new ground to anyone interested in database research.

As blockchain gains popularity, many DBMS thesis topics explore ways to integrate the two fields, and this research is expected to yield innovative solutions for data management. Here are three ways in which these technologies are being combined to create powerful new solutions:

Immutable Databases: By leveraging blockchain technology, it is possible to create databases that are immutable: once data has been added to such a database, it cannot be modified or deleted. This is particularly useful in situations where data integrity is critical, such as financial transactions or supply chain management (a minimal sketch follows this list).

Decentralized Databases: Blockchain technology enables the creation of decentralized databases. Here data is stored on a distributed network of computers rather than in a central location. This can help to improve data security and reduce the risk of data loss or corruption.

Smart Contracts: Smart contracts are self-executing contracts with the terms of the agreement between buyer and seller being directly written into lines of code. By leveraging blockchain technology, it is possible to create smart contracts that are stored and executed on a decentralized database, making it possible to automate a wide range of business processes.
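
As promised above, here is a minimal sketch, in Java, of the append-only, hash-chained ledger idea behind immutable databases; the class and record layout are illustrative and not tied to any particular blockchain database product.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.HexFormat;
import java.util.List;

// Minimal append-only ledger: each record stores the hash of its predecessor,
// so any later modification of a record breaks the chain and is detectable.
public class HashChainedLedger {

    public record Entry(long index, String payload, String previousHash, String hash) {}

    private final List<Entry> entries = new ArrayList<>();

    public Entry append(String payload) throws Exception {
        String previousHash = entries.isEmpty() ? "GENESIS" : entries.get(entries.size() - 1).hash();
        long index = entries.size();
        String hash = sha256(index + "|" + payload + "|" + previousHash);
        Entry entry = new Entry(index, payload, previousHash, hash);
        entries.add(entry);
        return entry;
    }

    // Recompute every hash and compare it with the stored chain.
    public boolean verify() throws Exception {
        String expectedPrevious = "GENESIS";
        for (Entry e : entries) {
            if (!e.previousHash().equals(expectedPrevious)) return false;
            String recomputed = sha256(e.index() + "|" + e.payload() + "|" + e.previousHash());
            if (!recomputed.equals(e.hash())) return false;
            expectedPrevious = e.hash();
        }
        return true;
    }

    private static String sha256(String input) throws Exception {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        return HexFormat.of().formatHex(digest.digest(input.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        HashChainedLedger ledger = new HashChainedLedger();
        ledger.append("PAYMENT supplier=ACME amount=1200.00");
        ledger.append("SHIPMENT order=42 status=DELIVERED");
        System.out.println("Chain intact: " + ledger.verify());  // true until any stored record is altered
    }
}

Because each entry embeds the hash of its predecessor, tampering with any stored record invalidates every later hash, which is exactly the property that makes such tables attractive for financial and supply-chain records.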

Childhood Obesity: Data Management 

Childhood obesity is a growing public health concern, with rates of obesity among children and adolescents rising around the world. To address this issue, it’s crucial to have access to comprehensive data on childhood obesity. Analyzing information on prevalence, risk factors, and interventions is a popular research topic in DBMS these days.

Effective data management is essential for ensuring that this information is collected, stored, and analyzed in a way that is useful and actionable. This is one of the hottest DBMS research paper topics. In this section, we will explore the topic of childhood obesity data management.

A key challenge in childhood obesity data management is ensuring data consistency, which is difficult because organizations measure and define obesity in different ways. For example:

Some may use body mass index (BMI) as a measure of obesity.

Others may use waist circumference or skinfold thickness. (A small sketch of standardizing the BMI calculation appears at the end of this section.)

Another challenge is ensuring data security and preventing unauthorized access. To protect the privacy and confidentiality of individuals, it is important to ensure appropriate safeguards are in place. This calls for database security research and its appropriate application.
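
Returning to the consistency point above: below is a small, hypothetical Java helper that standardizes how BMI is derived from raw measurements (weight in kilograms divided by the square of height in metres), so that records contributed by different organizations remain comparable. Obesity thresholds are deliberately omitted because they depend on age- and sex-specific reference charts.

// Hypothetical helper that standardizes how BMI is computed from raw measurements,
// so that records from different organizations remain comparable.
public final class BmiCalculator {

    private BmiCalculator() {}

    /** BMI = weight (kg) / height (m)^2, rounded to one decimal place. */
    public static double bmi(double weightKg, double heightMetres) {
        if (weightKg <= 0 || heightMetres <= 0) {
            throw new IllegalArgumentException("Weight and height must be positive");
        }
        double raw = weightKg / (heightMetres * heightMetres);
        return Math.round(raw * 10.0) / 10.0;
    }

    public static void main(String[] args) {
        System.out.println(bmi(45.0, 1.50));  // prints 20.0
    }
}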

Application of Computer Database Technology in Marketing

Leveraging data and analytics allows businesses to gain a competitive advantage in today's digitized world. With the rising demand for data, the use of computer databases in marketing has gained prominence.

The application of database capabilities in marketing has come into its own as one of the most popular current research topics in DBMS. In this section, we will explore how computer database technology is being applied in marketing, and the benefits this research can offer.

Customer Segmentation: Storing and analyzing customer data yields valuable insights, allowing businesses to identify trends in customer behavior, preferences and demographics. This information can be used to create highly targeted customer segments, so that marketing efforts can be tailored to specific groups of customers (a small segmentation sketch follows this list).

Personalization: Computer databases can be used to store and analyze customer data in real-time. In this way, businesses can personalize their marketing and offers based on individual customer preferences. This can help increase engagement and loyalty among customers, thereby driving greater revenue for businesses.

Predictive Analytics: Advanced analytics techniques such as machine learning and predictive modeling can throw light on patterns in customer behavior. This can even be used to predict their future actions. This information can be used to create more targeted marketing campaigns, and to identify opportunities for cross-selling and upselling.
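
As a small, hypothetical illustration of the segmentation idea, the Java sketch below buckets customer records (already loaded from a database) into spend tiers; the tier boundaries and field names are invented for the example.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative customer segmentation: bucket customers by total spend.
public class CustomerSegmentation {

    public record Customer(String id, double totalSpend) {}

    static String tier(Customer c) {
        if (c.totalSpend() >= 1000) return "HIGH_VALUE";
        if (c.totalSpend() >= 200)  return "REGULAR";
        return "OCCASIONAL";
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
                new Customer("C1", 1520.00),
                new Customer("C2", 310.50),
                new Customer("C3", 45.99));

        Map<String, List<Customer>> segments =
                customers.stream().collect(Collectors.groupingBy(CustomerSegmentation::tier));

        segments.forEach((segment, members) ->
                System.out.println(segment + ": " + members.size() + " customer(s)"));
    }
}

In practice the tiering rule would be driven by the customer data stored in the marketing database, but the grouping step looks the same.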

Database Technology in Sports Competition Information Management

Database technology has revolutionized the way in which sports competition information is managed and analyzed. With the increasing popularity of sports around the world, there is a growing need for effective data management systems that can collect, store, and analyze large volumes of relevant data. Thus, researching database technologies is vital to streamlining operations, improving decision-making, and enhancing the overall quality of events.

Sports organizations can use database technology to collect and manage a wide range of competition-related data such as: 

athlete and team information,

competition schedules and results,

performance metrics, and

spectator feedback.

Collating this data in a distributed database lets sports organizations easily analyze and derive valuable insights. This is emerging as a key DBMS research paper topic.

Database Technology for the Analysis of Spatio-temporal Data

Spatio-temporal data refers to data that has both a geographic and a temporal component; meteorological readings, GPS traces, and social media content are prime examples. Such data can provide valuable insights into patterns and trends across space and time, but its multidimensional nature makes analysis especially challenging. It is no surprise that this has become a hot topic for distributed database research.

In this section, we will explore how database technology is being used to analyze spatio-temporal data, and the benefits this research offers.

Data Storage and Retrieval: Spatio-temporal data tends to be very high-volume. Advances in database technology are needed to make storage, retrieval and consumption of such information more efficient. A solution to this problem will make such data more available. It will then be easily retrievable and usable by a variety of data analytics tools.

Spatial Indexing: Database technology can create spatial indexes to enable faster queries on spatio-temporal data. This allows analysts to quickly retrieve data for specific geographic locations or areas of interest, and to analyze trends across these areas.

Temporal Querying: Distributed database research also enables analysts to query data over specific time periods, which facilitates the identification of patterns over time and of how those patterns evolve across seasons (a toy index combining spatial and temporal filtering follows this list).
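
The toy index promised above: a self-contained Java sketch that buckets observations into latitude/longitude grid cells and then filters by time range. Real systems would rely on structures such as R-trees or dedicated spatio-temporal database extensions, so the cell size and API here are purely illustrative assumptions.

import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy spatio-temporal index: observations are bucketed into lat/lon grid cells,
// then filtered by time range inside the matching cell.
public class GridIndex {

    public record Observation(double lat, double lon, Instant time, String value) {}

    private static final double CELL_SIZE_DEGREES = 0.5;  // illustrative resolution
    private final Map<String, List<Observation>> cells = new HashMap<>();

    private static String cellKey(double lat, double lon) {
        long latCell = (long) Math.floor(lat / CELL_SIZE_DEGREES);
        long lonCell = (long) Math.floor(lon / CELL_SIZE_DEGREES);
        return latCell + ":" + lonCell;
    }

    public void insert(Observation o) {
        cells.computeIfAbsent(cellKey(o.lat(), o.lon()), k -> new ArrayList<>()).add(o);
    }

    /** Return observations in the same grid cell as (lat, lon) within [from, to). */
    public List<Observation> query(double lat, double lon, Instant from, Instant to) {
        return cells.getOrDefault(cellKey(lat, lon), List.of()).stream()
                .filter(o -> !o.time().isBefore(from) && o.time().isBefore(to))
                .toList();
    }

    public static void main(String[] args) {
        GridIndex index = new GridIndex();
        index.insert(new Observation(48.85, 2.35, Instant.parse("2024-03-01T10:00:00Z"), "18.2C"));
        index.insert(new Observation(48.86, 2.36, Instant.parse("2024-03-02T10:00:00Z"), "16.9C"));

        List<Observation> march1 = index.query(48.85, 2.35,
                Instant.parse("2024-03-01T00:00:00Z"), Instant.parse("2024-03-02T00:00:00Z"));
        System.out.println("Matches: " + march1.size());  // 1
    }
}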

Artificial Intelligence and Database Technology

Artificial intelligence (AI) is another sphere of technology that is still being explored, and it promises a wealth of breakthroughs that could change the entire world. It is unsurprising that the combination of AI with database technology is such a hot topic for database research papers today.

By using AI to analyze data, organizations can identify patterns and relationships that might not be apparent through traditional data analysis methods. In this section, we will explore some of the ways in which AI and database technology are being used together. We’ll also discuss the benefits that this amalgamation can offer.

Predictive Analytics: By analyzing large volumes of organizational and business data, AI can generate predictive models to forecast outcomes. For example, AI can go through customer data stored in a database and predict who is most likely to make a purchase in the near future.

Natural Language Processing: All businesses have huge, untapped wells of valuable information in the form of customer feedback and social media posts. These types of data sources are unstructured, meaning they don’t follow rigid parameters. By using natural language processing (NLP) techniques, AI can extract insights from this data. This helps organizations understand customer sentiment, preferences and needs.

Anomaly Detection: AI can be used to analyze large volumes of data to identify anomalies and outliers; a second round of analysis can then pinpoint potential problems or opportunities. For example, AI can analyze sensor data from manufacturing equipment and detect when the equipment is operating outside of normal parameters (a minimal detector sketch follows).
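
As a minimal illustration of the anomaly-detection idea, the Java sketch below flags readings that fall more than two standard deviations from the mean of a batch; production systems would use far more robust statistical or learned models, and the threshold here is only an assumption.

import java.util.List;

// Simple z-score anomaly detector over a batch of sensor readings.
public class AnomalyDetector {

    public static void main(String[] args) {
        List<Double> readings = List.of(20.1, 19.8, 20.3, 20.0, 19.9, 35.7, 20.2);

        double mean = readings.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
        double variance = readings.stream()
                .mapToDouble(r -> (r - mean) * (r - mean))
                .average().orElse(0.0);
        double stdDev = Math.sqrt(variance);

        for (double r : readings) {
            double z = stdDev == 0 ? 0 : (r - mean) / stdDev;
            if (Math.abs(z) > 2.0) {  // illustrative threshold
                System.out.printf("Anomalous reading: %.1f (z = %.2f)%n", r, z);
            }
        }
    }
}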

Data Collection and Management Techniques of a Qualitative Research Plan

Any qualitative research calls for the collection and management of empirical data. A crucial part of the research process, this step benefits from good database management techniques. Let's explore the main data collection methods, and how database management supports them, to ensure the success of a qualitative research plan.

Interviews: This is one of the most common methods of data collection in qualitative research. Interviews can be conducted in person, over the phone, or through video conferencing. A standardized interview guide ensures the data collected is reliable and accurate. Relational databases, with their inherent structure, aid in this process. They are a way to enforce structure onto the interviews’ answers.

Focus Groups: Focus groups involve gathering a small group of people to discuss a particular topic. These generate rich data by allowing participants to share their views in a group setting. It is important to select participants who have knowledge or experience related to the research topic.

Observations: Observations involve observing and recording events in a given setting. These can be conducted openly or covertly, depending on the research objective and setting. To ensure that the data collected is accurate, it is important to develop a detailed observation protocol that outlines what behaviors or events to observe, how to record data, and how to handle ethical issues.

Database Technology in Video Surveillance System 

Video surveillance systems are used to monitor and secure public spaces, workplaces, and even homes. With the increasing demand for such systems, it is important to have an efficient and reliable way to store, manage and analyze the data they generate. This is where database research comes in.

By using database technology in video surveillance systems, it is possible to store and manage large amounts of video data efficiently. Database management systems (DBMS) can be used to organize video data in a way that is easily searchable and retrievable. This is particularly important in cases where video footage is needed as evidence in criminal investigations or court cases.

In addition to storage and management, database technology can also be used to analyze video data. For example, machine learning algorithms can be applied to video data to identify patterns and anomalies that may indicate suspicious activity. This can help law enforcement agencies and security personnel to identify and respond to potential threats more quickly and effectively.

Application of Java Technology in Dynamic Web Database Technology 

Java technology has proven its flexibility, scalability, and ease of use over the decades, which is why it is widely used in the development of dynamic web database applications. In this section, we will explore research topics in DBMS that apply Java technology to databases.

Java Server Pages (JSP): JSP is a Java technology used to create dynamic web pages that can interact with databases. It allows developers to embed Java code within HTML pages, enabling dynamic pages that interact with databases in real time and aid in data collection and maintenance.

Java Servlets: Java Servlets are Java classes used to extend the functionality of web servers. They provide a way to handle incoming requests from web browsers and generate dynamic content that can interact with databases.

Java Database Connectivity (JDBC): JDBC is a Java API that provides a standard interface for accessing databases. It allows Java applications to connect to a database and execute SQL queries to read, modify or manage the backend data, which enables developers to create dynamic web applications (a minimal example follows).
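
The snippet below is a minimal JDBC sketch: it opens a connection, runs a parameterized query, and iterates over the result set. The JDBC URL, credentials, table and column names are placeholders for whatever schema your application actually uses, and the appropriate JDBC driver must be on the classpath.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Minimal JDBC usage: connect, run a parameterized query, read results.
public class JdbcExample {

    public static void main(String[] args) {
        String url = "jdbc:mysql://localhost:3306/shop";   // placeholder connection string
        String user = "app_user";
        String password = "change_me";

        String sql = "SELECT id, name FROM customers WHERE country = ?";

        try (Connection conn = DriverManager.getConnection(url, user, password);
             PreparedStatement stmt = conn.prepareStatement(sql)) {

            stmt.setString(1, "IN");                        // bind the country parameter
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + " " + rs.getString("name"));
                }
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}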

Online Multi Module Educational Administration System Based on Time Difference Database Technology 

With the widespread adoption of remote learning after COVID, online educational systems are gaining popularity at a rapid pace. A ubiquitous challenge these systems face is managing multiple modules across different time zones, and this has become one of the latest research topics in database management systems.

Time difference database technology is designed to handle time zone differences in online systems. By leveraging this, it’s possible to create a multi-module educational administration system that can handle users from different parts of the world, with different time zones.

This type of system can be especially useful for online universities or other educational institutions that have a global reach:

It makes it possible to schedule classes, assignments and other activities based on the user's time zone, ensuring that everyone can participate in real-time.

In addition to managing time zones, a time difference database system can also help manage student data, course materials, grades, and other important information (a small time-zone conversion sketch follows).
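
A minimal sketch of the time-zone handling such a system needs, using Java's standard java.time API: a lecture is scheduled once in the institution's zone and rendered in each student's local zone. The zone identifiers and times are illustrative.

import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.List;

// Schedule a lecture once, display it in each participant's local time zone.
public class ScheduleConverter {

    public static void main(String[] args) {
        ZonedDateTime lecture = ZonedDateTime.of(2024, 9, 2, 17, 0, 0, 0,
                ZoneId.of("America/New_York"));  // 5:00 PM in the institution's zone

        List<ZoneId> studentZones = List.of(
                ZoneId.of("Europe/Berlin"),
                ZoneId.of("Asia/Kolkata"),
                ZoneId.of("Australia/Sydney"));

        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm zzz");
        System.out.println("Scheduled:  " + lecture.format(fmt));
        for (ZoneId zone : studentZones) {
            System.out.println(zone + ": " + lecture.withZoneSameInstant(zone).format(fmt));
        }
    }
}

Storing the schedule once in a canonical zone (or as UTC) and converting on display is the usual design choice; it avoids storing a separate copy of every event per user.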

Why is it Important to Study Databases?

Databases are the backbone of many modern technologies and applications, making it essential for professionals in various fields to understand how they work. Whether you're a software developer, data analyst or a business owner, understanding databases is critical to success in today's world. Here are a few reasons why it is important to study databases, and why more database research should be published:

Efficient Data Management

Databases enable the efficient storage, organization, and retrieval of data. By studying databases, you can learn how to design and implement effective data management systems that can help organizations store, analyze, and use data efficiently.

Improved Decision-Making

Data is essential for making informed decisions, and databases provide a reliable source of data for analysis. By understanding databases, you can learn how to retrieve and analyze data to inform business decisions, identify trends, and gain insights.

Career Opportunities

In today's digital age, many career paths require knowledge of databases. By studying databases, you can open up new career opportunities in software development, data analysis, database administration and related fields.

Needless to say, studying databases is essential for anyone who deals with data. Whether you're looking to start a new career or enhance your existing skills, studying databases is a critical step towards success in today's data-driven world.

Final Takeaways

In conclusion, as you are interested in database technology, we hope this blog has given you some insights into the latest research topics in the field. From blockchain to AI, from sports to marketing, there are a plethora of exciting database topics for research papers that will shape the future of database technology.

As technology continues to evolve, it is essential to stay up-to-date with the latest trends in the field of databases. Our curated KnowledgeHut Database Certification Courses will help you stay ahead of the curve and develop new skills.

We hope this blog has inspired you to explore the exciting world of database research in 2024. Stay curious and keep learning!

Frequently Asked Questions (FAQs)

What are some examples of databases?

There are several examples of databases, with the five most common ones being:

MySQL: An open-source RDBMS commonly used in web applications.

Microsoft SQL Server: A popular RDBMS used in enterprise environments.

Oracle: A trusted commercial RDBMS known for its high scalability and security.

MongoDB: A NoSQL document-oriented database optimized for storing large amounts of unstructured data.

PostgreSQL: An open-source RDBMS offering advanced features such as high concurrency and support for many data types.

Is SQL a database?

Structured Query Language (SQL) is a high-level language designed to communicate with relational databases. It is not a database in and of itself; rather, it is a language used to create, modify, and retrieve data from relational databases such as MySQL and Oracle.

What is a primary key?

A primary key is a column (or a set of columns) that uniquely identifies each row in a table. In technical terms, the primary key is a unique identifier of records, and it is used as a reference to establish relationships between tables (see the short schema sketch below).
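
To illustrate, the hypothetical schema below defines a primary key on a customers table and references it from an orders table through a foreign key. It is executed here over JDBC against an in-memory H2 database (assuming the H2 driver is on the classpath), and the table and column names are invented for the example.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Define a primary key and reference it from another table via a foreign key.
public class SchemaSetup {

    public static void main(String[] args) throws Exception {
        // In-memory H2 database used only for the demonstration.
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo");
             Statement stmt = conn.createStatement()) {

            stmt.execute("CREATE TABLE customers ("
                    + " customer_id BIGINT PRIMARY KEY,"   // uniquely identifies each row
                    + " name VARCHAR(100) NOT NULL)");

            stmt.execute("CREATE TABLE orders ("
                    + " order_id BIGINT PRIMARY KEY,"
                    + " customer_id BIGINT NOT NULL,"
                    + " FOREIGN KEY (customer_id) REFERENCES customers(customer_id))");  // relationship via the primary key

            System.out.println("Schema created");
        }
    }
}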


Spandita Hati

Spandita is a dynamic content writer who holds a master's degree in Forensics but loves to play with words and dabble in digital marketing. Being an avid travel blogger, she values engaging content that attracts, educates and inspires. With extensive experience in SEO tools and technologies, her writing interests are as varied as the articles themselves. In her leisure, she consumes web content and books in equal measure.


Database Management Systems (DBMS)

Database group website: db.cs.berkeley.edu

Declarative languages and runtime systems

Design and implementation of declarative programming languages with applications to distributed systems, networking, machine learning, metadata management, and interactive visualization; design of query interface for applications.

Scalable data analysis and query processing

Scalable data processing in new settings, including interactive exploration, metadata management, cloud and serverless environments, and machine learning; query processing on compressed, semi-structured, and streaming data; query processing with additional constraints, including fairness, resource utilization, and cost.

Consistency, concurrency, coordination and reliability

Coordination avoidance, consistency and monotonicity analysis; transaction isolation levels and protocols; distributed analytics and data management, geo-replication; fault tolerance and fault injection.

Data storage and physical design

Hot and cold storage; immutable data structures; indexing and data skipping; versioning; new data types; implications of hardware evolution.

Metadata management

Data lineage and versioning; usage tracking and collective intelligence; scalability of metadata management services; metadata representations; reproducibility and debugging of data pipelines.

Systems for machine learning and model management

Distributed machine learning and graph analytics; physical and logical optimization of machine learning pipelines; online model management and maintenance; prediction serving; real-time personalization; latency-accuracy tradeoffs and edge computing for large-scale models; machine learning lifecycle management.

Data cleaning, data transformation, and crowdsourcing

Human-data interaction including interactive transformation, query authoring, and crowdsourcing; machine learning for data cleaning; statistical properties of data cleaning pipelines; end-to-end systems for crowdsourcing.

Interactive data exploration and visualization

Interactive querying and direct manipulation; scalable spreadsheets and data visualization; languages and interfaces for interactive exploration; progressive query visualization; predictive interaction.

Secure data processing

Data processing under homomorphic encryption; data compression and encryption; differential privacy; oblivious data processing; databases in secure hardware enclaves.

Foundations of data management

Optimal trade-offs between storage, quality, latency, and cost, with applications to crowdsourcing, distributed data management, stream data processing, version management; expressiveness, complexity, and completeness of data representations, query languages, and query processing; query processing with fairness constraints.

Research Centers

  • EPIC Data Lab
  • Sky Computing Lab

Faculty

  • Alvin Cheung
  • Natacha Crooks
  • Joseph Gonzalez
  • Joseph M. Hellerstein (coordinator)
  • Jiantao Jiao
  • Aditya Parameswaran
  • Matei Zaharia
  • Eric Brewer
  • Michael Lustig
  • Jelani Nelson

Faculty Awards

  • ACM Prize in Computing: Eric Brewer, 2009.
  • National Academy of Engineering (NAE) Member: Ion Stoica, 2024. Eric Brewer, 2007.
  • American Academy of Arts and Sciences Member: Eric Brewer, 2018.
  • Sloan Research Fellow: Aditya Parameswaran, 2020. Alvin Cheung, 2019. Jelani Nelson, 2017. Michael Lustig, 2013. Ion Stoica, 2003. Joseph M. Hellerstein, 1998. Eric Brewer, 1997.

Related Courses

  • CS 186. Introduction to Database Systems
  • CS 262A. Advanced Topics in Computer Systems

CSE 5249 - Research Topics in Database Management Systems

Time: Tuesdays & Thursdays, 3:55 PM - 4:50 PM
Room: Dreese Labs 295

Instructor: Spyros Blanas, [lastname][email protected], office hours by appointment.

I will notify all students via e-mail when this webpage is updated and I will list every update here:

SEP 28 UPDATE: Small changes about who leads each paper discussion in the schedule.

NOV 3 UPDATE: Added paper summaries for remaining papers; please find instructions below. Assigned presentation slots.

Course description

This seminar focuses on recent research results in the intersection of data management and systems. There is no formal textbook for this course. We will mostly be reading and discussing recently published papers in venues such as SIGMOD, VLDB and ICDE. An important component of the course is an individual research project, where you will pick one topic of interest in the area of database management systems and explore it in depth.

This course mainly discusses the latest research findings on data management and builds on the foundations introduced in CSE 5242, the Advanced Database Management Systems course. It does not have a structure to guide you to success (such as a textbook, exams, or help from a GTA), so it is not a good fit if you are not motivated to study and conduct independent research. The course is intended for:

  • Ph.D. students in any group who need this background for their research.
  • Ph.D. or M.Sc. students who intend to work in the Data Management & Mining group towards a Ph.D. dissertation or an M.Sc. thesis.
  • M.Sc. or B.Sc. students who have done very well in CSE 5242 and are curious about recent research topics in data management. Many students in this category take CSE 5249 after they have accepted a job offer that involves building a data management system or service and want to learn about ideas that have not yet appeared in mainstream products.

Check Carmen for PDF versions of the papers.

Prerequisites

This course builds on the material introduced in CSE 3242/5242, the Advanced Database Management Systems course.

This course has two main components, as follows:

Paper summaries

In order to make the most of our in-class time, you are expected to submit a summary of the assigned reading before each class. For all questions, don't paraphrase (or copy verbatim) what is written in the paper. Papers frequently have different contributions than their authors claimed when they were writing them.

Paper summaries will be graded on a scale from zero to two. Zero is reserved for summaries that have not been submitted or are unreadable. One reflects a summary that can be improved, either for length, clarity or insight. Two represents a solid effort at summarizing the paper. One bonus point will be given to a few exceptionally insightful summaries.

Each summary must answer exactly the following questions. Remember that summaries are graded on clarity and insight, and not their length!

Answers to all questions are due by 1am on the day the paper is discussed. Upload your answers to Carmen as a single plaintext file. Please include the questions in the submitted file. No Microsoft Word or Adobe Acrobat files will be accepted.

Class project

You will also work on an individual research project on a topic of mutual research interest. (Group projects will not be allowed.) I can provide a list of ideas on interesting topics and discuss any ideas you have.

It is your responsibility to meet with the instructor periodically throughout the semester to discuss the general direction and the progress of the class project. You must take the initiative to actively explore the topic you choose; otherwise you will not accomplish much in the project, and your class project grade will be adversely impacted as a consequence.

  • What is the problem you are solving?
  • What have others done already to solve this or a similar problem?
  • What is your solution, and what did you accomplish during the last three months?
  • What are the results? Does your solution improve over what prior work has already accomplished?
  • In retrospect, what could you have done better in this project?
  • If someone else looks at this problem in the future, what are the aspects of the problem that you did not have time to explore?

Source code: Before submitting your source code, please delete any intermediate files and executable binaries. (These will not work on any platform other than your own system.) If you have worked with a large codebase (PostgreSQL, Impala, MySQL, etc.) please submit only a diff of your changes, and include a reference to the "base" version you modified. Examples include "PostgreSQL 9.4.0", or "Linux 3.x development branch, git commit f3f62a38ce".

If the source code is small (a few MBs), please upload it with your report on Carmen. Ohio State offers BuckeyeBox, a version of the Box file-sharing service, for this purpose, which you can access using your Carmen credentials. It is not necessary to use this service, as long as you include a link to your source code.

Classroom etiquette

Please do not use phones, tablets, laptops, or other technological distractions.

Academic conduct

Advances in Databases and Information Systems

  • Published: 27 December 2017
  • Volume 20, pages 1–6 (2018)


  • Ladjel Bellatreche,
  • Patrick Valduriez &
  • Tadeusz Morzy


1 Introduction

The success stories of database technology keep companies continuously demanding more and more efficient services related to two principal entities: data and queries. The services dedicated to data mainly concern the following tasks: collecting, cleaning, filtering, integrating, sharing, storing, transferring, visualizing, analyzing and securing. Regarding queries, services are often concentrated on processing, optimizing, personalizing and recommending. The services around both entities have to be revisited and extended to deal with the dimensions brought by the Big Data era and the new requirements of companies in the contexts of globalization, competition and climate change.

While Huang et al. (2017) point out the seven V's of Big Data, we would like to highlight four dimensions, characterized by four V's, that challenge the traditional solutions of databases in terms of data acquisition, storage, management and analysis. The first V concerns the Volume of data generated by traditional and new providers. To illustrate the data deluge, let us consider three examples of data providers (Footnote 1): (i) the massive use of sensors (e.g., 10 terabytes of data are generated by planes every 30 minutes), (ii) the massive use of social networks (e.g., 340 million tweets per day), and (iii) transactions (Walmart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data). The second V is associated with Variety: data may come from various sources, in different formats (transactions, log data, social networks, sensors, etc.) and from various applications, and may be structured (e.g., database tables), semi-structured (e.g., XML), or unstructured (e.g., text, images, video streams, audio) and more. The third V is about Velocity: large amounts of data from transactions with high refresh rates result in data streams arriving at great speed, and the time to act on the basis of these streams is often very short, so traditional batch processing has to move to real-time streaming. The fourth V, which has received little attention, is related to Vocabulary: the schemes, data models, semantics, ontologies, taxonomies, and other content- and context-based metadata that describe the data's structure, syntax, content, and provenance.

The traditional database non-functional requirements were mainly focused on (i) improving low-latency query processing to satisfy the needs of end users (e.g., decision makers), (ii) minimizing the maintenance cost of the databases and their optimization structures (e.g., indexes and materialized views), and (iii) making better use of the storage dedicated to those optimization structures. These requirements have been enriched by new ones, mainly motivated by the development of green computing and the services delivered by Cloud computing. For some years now, the international community, regrouping states, governments, associations and users, has been closely involved in climate change by proposing initiatives to limit global warming, and the database community has spared no effort to propose initiatives in this direction. As a consequence, the reduction of energy consumption has become a new non-functional requirement integrated into the processes of designing and exploiting database and information systems (Roukh et al. 2017). The Claremont report on database research states the importance of designing power-aware DBMSs that limit energy costs without sacrificing scalability. This is also echoed in the more recent Beckman report on databases, which considers energy-constrained processing a challenging issue in Big Data (Abadi et al. 2016). Another non-functional requirement that emerges with the development of Cloud computing is pricing (Pay-as-You-Go) (Toosi et al. 2016), because Cloud computing providers offer numerous on-line services based on SLAs (Service Level Agreements) between them and their customers.

The considerations above open the door to innovative research directions and challenges for the database research community, by exploiting three opportunities: (i) computational and storage resource modeling and organization, (ii) Big Data programming models, and (iii) processing power. This can be accomplished by means of the powerful hardware and infrastructures, as well as the new programming models, available today.

Opportunity 1

Database storage systems are evolving towards decentralized commodity clusters that can scale in terms of capacity, processing power, and network throughput. The efforts deployed to design such systems aim at simultaneously sharing physical resources and data between applications. Cloud computing has largely contributed to augmenting the sharing capabilities of these systems thanks to its attractive characteristics: elasticity, flexibility, and on-demand storage and computing services.

Opportunity 2

Despite the data deluge and the spectacular power of machines, traditional programming languages show their limitations. A new generation of programming models has been proposed, known as Big Data programming models (Wu et al. 2017). They represent the style of programming and the interface paradigm for developers writing big data application programs (Wu et al. 2017). A nice classification of these programming paradigms is given in Wu et al. (2017). The authors distinguish eight classes of programming models: (1) MapReduce (e.g., Hadoop), (2) functional (e.g., Spark, Flink), (3) SQL-based (e.g., HiveQL, SparkSQL), (4) actor (e.g., Akka, Storm), (5) statistical and analytical (e.g., R, Mahout), (6) dataflow (e.g., Oozie, Dryad), (7) Bulk Synchronous Parallel (BSP) (e.g., Giraph, Hama), and (8) high-level DSL (e.g., Pig Latin, LINQ).

Opportunity 3

Traditionally, storage systems managed databases that were primarily stored on secondary storage, and only a small part of the data could fit in main memory. Therefore, disk input-output (IO) was the dominating cost factor. Nowadays, it is possible to equip servers with several terabytes of main memory, which allows us to keep databases in main memory and avoid the IO bottleneck (Arnold et al. 2014). As a result, the performance of databases has become limited by memory access and processing power (Breß et al. 2014). Many heterogeneous devices (e.g., GPUs, FPGAs, APUs) are available and can be used in parallel to process database operations, where each processor is optimized for a certain application scenario (Breß et al. 2014, 2016).

The database community, including researchers, industry and funding organizations, has spared no effort to conduct advanced research, exploiting the achievements and opportunities above, to satisfy the requirements of companies in terms of data management and exploitation.

In the rest of this article, we first discuss some specific research challenges around databases. We then present, in Section 3, some of the latest developments in this research area; in particular, we report on four research papers addressing different aspects of these challenges.

2 Databases Research Challenges

Database and information systems technologies raise a good number of research challenges, and in this section we discuss some of the most important ones, which are related to the ADBIS conference from which this special section originates.

The ADBIS conference is one concrete example of these efforts. It has been widely accepted as a key forum for helping enterprises and organizations improve their abilities in data modeling, data management, data exploitation and information systems; it has attracted the international interest of the research community and is mentioned in several ranking lists and indexed in several digital libraries (e.g., DBLP, Google Scholar, Microsoft Academic Search).

The first challenge we would like to discuss is the management of the evolution of new advanced databases. Methodologies and techniques used for designing advanced databases (data integration systems, data warehouses, data marts, etc.), research developments, and most of the commercially available technologies tacitly assumed that a semantic integration system is static (Bellatreche and Wrembel 2013). In practice, however, this assumption turned out to be false. An advanced database system requires changes as a result of, among others: (1) the evolution of data sources, (2) changes of the real world represented in an integration system, (3) the evolution of the domain ontologies and knowledge bases (such as DBpedia, FreeBase, Linked Open Data Cloud, Yago, etc.) usually involved in the construction of these databases, (4) new user requirements, and (5) the creation of simulation scenarios (what-if analysis).

As reported in the literature, structures of data sources change frequently. For example, during the last 4 years, the schema of Wikipedia changed every 9–10 days, on the average. From our experience, schemas of data sources may change even more frequently. For example, telecommunication data sources changed their schemas every 7–13 days, on the average. Banking data sources are more stable but they changed their schemas every 2–4 weeks, on the average. Changes in the structures of data sources impact all the layers of advanced database systems. Since such changes are frequent, developing solutions and tools for handling them automatically or semi-automatically is of high practical importance.

Data cleaning, also called data cleansing or scrubbing, represents a serious challenge when integrating data from various sources (Rahm and Do 2000). It aims at detecting and removing errors and inconsistencies from data in order to improve data quality. Data cleaning is especially required when integrating heterogeneous data sources and should be addressed together with schema-related data transformations. In data warehouses, data cleaning is a major part of the so-called ETL (Extract, Transform, Load) process. The first task of an ETL process is to extract data from multiple data sources, typically into a Data Staging Area (DSA). Once data are available in the DSA, the second phase performs data quality checks and transformations in order to make the data clean and consistent with the structure of the materialized integration system. Finally, the third phase loads the data into the integration system. The ETL process is error-prone and can cause significant downtime, especially when it deals with a deluge of data issued from various and heterogeneous data sources. The causes of these errors mainly concern the availability of data sources, reading and writing failures, incomplete data, duplicates, and system crashes or power outages on the machine hosting the ETL process. The development of rigorous testing procedures before the loading task contributes to increasing the quality of data integration systems.
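
As a deliberately tiny, hypothetical sketch of the extract-transform-load flow described above, the Java program below takes raw records from a staging list, cleans and deduplicates them, and loads the result into a target collection; in a real pipeline the source and target would of course be external systems rather than in-memory lists.

import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Tiny ETL flow: extract raw records, clean and deduplicate them, load them into a target.
public class MiniEtl {

    public static void main(String[] args) {
        // Extract: raw rows as they might arrive in a data staging area.
        List<String> staging = List.of(" alice@example.com ", "BOB@EXAMPLE.COM",
                "alice@example.com", "", "carol@example.com");

        // Transform: trim, normalize case, drop empty values, deduplicate.
        Set<String> cleaned = staging.stream()
                .map(String::trim)
                .map(String::toLowerCase)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));

        // Load: here, simply copy into the "target" collection.
        List<String> target = List.copyOf(cleaned);
        target.forEach(System.out::println);  // alice@example.com, bob@example.com, carol@example.com
    }
}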

An issue connected to ETL quality is ETL performance. It should be noted that ETL processes are performed outside the DBMS. An ETL process is typically implemented as a workflow, in which various tasks (a.k.a. activities or operations) that process data are connected by data flows (Ali and Wrembel 2017). Optimizing this workflow is one of the important challenges for the ETL research community, especially in the Big Data era, where various sources are involved in the integration process together with a deluge of data. Recourse to Big Data programming models is highly recommended (Ali and Wrembel 2017).

The fourth challenge is related to the vocabulary brought by Big Data. Vocabulary has a great role in designing new applications requiring semantics. Let us consider the example of an emerging topic, Cultural Heritage, which faces an array of challenges. Scientific researchers, organizations, associations and schools are looking for relevant technologies for accessing, integrating, sharing, annotating, visualizing and analyzing the mine of cultural collections while considering the profiles and preferences of end users. Most cultural information systems today process data at the syntactic level, without leveraging the rich semantic structures underlying the content. Moreover, they use multiple thesauri without a formal connection between them. This situation was identified in the 1990s, when the need to build a unique interface to access huge collections of data appeared. During the last decades, Semantic Web solutions have been proposed to make the semantics of data sources explicit and their content machine-understandable and interoperable. Efforts to integrate Semantic Web technology into Cultural Heritage have to be made, by constructing adequate ontologies and vocabularies (Markhoff et al. 2017).

Finally, another challenge that has to be addressed is the development of cost models evaluating the non-functional requirements. Generally speaking, a cost model (\(\mathcal{CM}\)) can be seen as a mathematical function whose inputs are parameters and whose output is the value of the measured cost in terms of response time, size and/or energy. A \(\mathcal{CM}\) has five main roles: (i) it selects the best query plans (Andrès et al. 1995); (ii) it guides algorithms that select optimization structures such as indexes and materialized views (Bellatreche et al. 2000); (iii) it is used to deploy a database on advanced platforms (parallel, Cloud, etc.) (Kunjir et al. 2017); (iv) it is used by advisory tools to assist database administrators in various system tuning and physical design tasks (Chaudhuri and Narasayya 1998); and (v) it underpins self-driving database management systems (Pavlo et al. 2017). A \(\mathcal{CM}\) is difficult to develop, since it includes several parameters belonging to databases, platforms, DBMSs, queries, devices, etc. (Ouared et al. 2016). The evolution of non-functional requirements, databases, processing devices, storage systems, etc. pushes the database community to propose methodologies and guidelines to construct, calibrate and validate \(\mathcal{CM}\)s.
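
To give a concrete flavour of such a cost model, the simplified formula below estimates the cost of a sequential scan in the spirit of PostgreSQL's planner, where calibrated constants weight page I/O against per-tuple CPU work; the exact terms are an illustrative assumption rather than a complete model:

\(\mathcal{CM}_{seq\;scan}(R) = P(R)\cdot c_{page} + T(R)\cdot c_{tuple}\)

where \(P(R)\) is the number of disk pages of relation \(R\), \(T(R)\) is its number of tuples, and \(c_{page}\) and \(c_{tuple}\) are constants calibrated for the target platform (PostgreSQL's seq_page_cost and cpu_tuple_cost parameters play these roles).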

3 Papers in this Special Section

We received 14 papers for our special section, distributed as follows:

  • two papers from the main ADBIS conference (Morzy et al. 2015);

  • the best paper of each ADBIS workshop (the MEBIS and GIG workshops were merged), for a total of six papers from the ADBIS workshops;

  • six papers from our open call.

Authors of selected papers (from the ADBIS main conference and its workshops) were invited to submit an extended version with at least 30% new technical content. These papers were evaluated by at least two reviewers, and after a second round of reviews we finally accepted four papers. Thus, the relative acceptance rate for the papers included in this special section is competitive (28.5%). Needless to say, these four papers represent innovative and high-quality research. The topics of the accepted papers are very timely and include Big Data applications and principles, evolving business intelligence systems, cultural heritage preservation and enhancement, and database evolution management.

We congratulate the authors of these four papers and thank all authors who submitted articles to ADBIS 2014 and to our special section. It should be noted that certain papers used case studies issued from international projects funded either by the European Commission or by the German Research Foundation.

The four selected papers are summarized as follows:

The first paper, titled Evaluating Queries and Updates on Big XML Documents, by Carlo Sartiani, Nicole Bidoit, Dario Colazzo and Noor Malla, presents Andromeda, a system able to execute a subcategory of iterative and update XQuery queries over MapReduce (Sartiani et al. 2018). This subcategory is identified as queries that iterate over forward XPaths and can therefore easily be distributed by partitioning the document according to these paths. The authors describe the global architecture, the basic principle and the algorithms used in Andromeda. The basic idea of this system is to dynamically and/or statically partition the input data to leverage the parallelism of a Map/Reduce cluster and to increase scalability. The substantial effort on the formalization of iterative XQuery queries and updates has to be highlighted. Two partitioning algorithms are given, for iterative XQuery queries and for updates, respectively. Intensive experiments have been conducted on a multi-tenant cluster composed of a single master machine and 100 slave machines, using two distinct datasets covering iterative and update queries. The proposal is compared against existing systems such as BaseX.

The second paper, titled Dependency Modelling for Inconsistency Management in Digital Preservation - The PERICLES Approach, by Nikolaos Lagos, Marina Riga, Panagiotis Mitzias, Jean-Yves Vion-Dury, Efstratios Kontopoulos, Simon Waddington, Georgios Meditskos, Pip Laurenson, and Ioannis Kompatsiaris, presents an important aspect of the PERICLES project: the Linked Resource Model (LRM), a conceptual model for handling contextual and environmental constraints, focusing on the preservation of cultural heritage (Lagos et al. 2018). In the proposed approach, models provide an abstract representation of essential aspects of the environment; the goal is to model details of dependencies concerning components of an artwork, for instance a dependency between MS Visio and JPEG objects. The LRM provides concepts that allow recording when a change is triggered and its impact on other entities. Here the LRM is extended with links to other existing ontologies, giving rise to Digital Video Art (DVA), a domain-specific model. Examples of automatic reasoning and handling of inconsistencies are given, based on the use of SPIN. The presented case study concerns the preservation of video art and considers the problem of how technology changes are managed and traced over time. The particularity of this paper is its use of a domain-independent ontology (LRM), combined with a domain-specific ontology, to model changes to video art; it is also applied to detecting inconsistencies. This work presents one of the key outcomes of the PERICLES FP7 project ( http://pericles-project.eu/ ).

The third paper, entitled Robust and Simple Database Evolution, by Kai Herrmann, Hannes Voigt, Jonas Rausch, Andreas Behrend and Wolfgang Lehner, presents a domain-specific language (DSL), named CoDEL, to support the database evolution process. Apart from the language concepts and syntax, the authors prove that their language is as expressive as the relational algebra. The paper starts from the very important and real observation that, nowadays, the process of software evolution, including updates to meet frequent changes of user requirements, is much better supported than the process of database evolution. The main argument given by the authors is that the database evolution process often presents a major bottleneck in the whole software evolution process (Herrmann et al. 2018). CoDEL is a well-defined, relationally complete database evolution language; consequently, it can serve as a reference language for productive implementations of database evolution in DBMSs or as a foundation for further support of database evolution. As an example, the authors present semi-automatic variant co-evolution. This work is funded by the German Research Foundation (Deutsche Forschungsgemeinschaft; DFG) within the RoSI research training group (GRK 1907).

The fourth paper, titled ETL Workflow Reparation by Means of Case-Based Reasoning, by Artur Wojciechowski, addresses a problem of high interest: how to cope with data source evolution in the ETL context. The paper proposes a prototype ETL framework, called E-ETL, for repairing ETL workflows semi-automatically when structural changes in data sources occur (Wojciechowski 2018). The framework is based on a Case-Based Reasoning method and consists of two main algorithms: (1) a Case Detection Algorithm and (2) a Best Case Searching Algorithm for choosing the most appropriate case. A test case scenario is presented to illustrate the approach, and an experimental evaluation of its performance is reported for six different factors influencing that performance.

Footnote 1: http://www.economist.com/node/15579717

Abadi, D., Agrawal, R., Ailamaki, A., Balazinska, M., Bernstein, P.A., Carey, M.J., & Widom, J. (2016). The Beckman report on database research. Communications of the ACM, 59(2), 92–99.


Ali, S.M.F., & Wrembel, R. (2017). From conceptual design to performance optimization of etl workflows: current state of research and open problems. The VLDB Journal.

Andrès, F., Kwakkel, F., & Kersten, M.L. (1995). Calibration of a DBMS cost model with the software testpilot. In CISMOD (pp. 58–74).

Arnold, O., Haas, S., Fettweis, G., Schlegel, B., Kissinger, T., & Lehner, W. (2014). An application-specific instruction set for accelerating set-oriented database primitives. In Acm sigmod (pp. 767–778).

Bellatreche, L., Karlapalem, K., & Schneider, M. (2000). On efficient storage space distribution among materialized views and indices in data warehousing environments. In Acm cikm (pp. 397–404).

Bellatreche, L., & Wrembel, R. (2013). Special issue on: Evolution and versioning in semantic data integration systems. Journal of Data Semantics , 2 (2–3), 57–59. Retrieved from https://doi.org/10.1007/s13740-013-0020-6 .

Breß, S., Funke, H., & Teubner, J. (2016). Robust query processing in coprocessor-accelerated databases. In Acm sigmod (pp. 1891–1906).

Breß, S., Siegmund, N., Heimel, M., Saecker, M., Lauer, T., Bellatreche, L., & Saake, G. (2014). Load-aware inter-co-processor parallelism in database query processing. Data and Knowledge Engineering , 93 , 60–79.

Chaudhuri, S., & Narasayya, V.R. (1998). Autoadmin what-if index analysis utility. In Acm sigmod (pp. 367–378).

Herrmann, K., Voigt, H., Rausch, J., Behrend, A., & Lehner, W. (2018). Robust and simple database evolution. Information Systems Frontiers, 20(1). https://doi.org/10.1007/s10796-016-9730-2 .

Huang, S.-C., McIntosh, S., Sobolevsky, S., & Hung, P.C.K. (2017). Big data analytics and business intelligence in industry. Information Systems Frontiers , 19 , 1229. https://doi.org/10.1007/s10796-017-9804-9 .

Kunjir, M., Fain, B., Munagala, K., & Babu, S. (2017). ROBUS: fair cache allocation for data-parallel workloads. In Acm sigmod (pp. 219–234).

Lagos, N., Riga, M., Mitzias, P., Vion-Dury, J.-Y., Kontopoulos, E., Waddington, S., & Kompatsiaris, I. (2018). Dependency modelling for inconsistency management in digital preservation - the pericles approach. Information Systems Frontiers, 20(1). https://doi.org/10.1007/s10796-016-9709-z .

Markhoff, B., Nguyen, T.B., & Niang, C. (2017). When it comes to querying semantic cultural heritage data. In New trends in databases and information systems: Adbis 2017 short papers and workshops (pp. 384–394).

Morzy, T., Valduriez, P., & Bellatreche, L. (Eds.). (2015). Advances in databases and information systems , Vol. 9282. Berlin: Springer.

Ouared, A., Ouhammou, Y., & Bellatreche, L. (2016). Costdl: A cost models description language for performance metrics in database. In 21st international conference on engineering of complex computer systems, ICECCS (pp. 187–190).

Pavlo, A., Angulo, G., Arulraj, J., Lin, H., Lin, J., Ma, L., & Zhang, T. (2017). Self-driving database management systems. In CIDR .

Rahm, E., & Do, H.H. (2000). Data cleaning: Problems and current approaches. IEEE Data Engineering Bulletin , 23 (4), 3–13.


Roukh, A., Bellatreche, L., Bouarar, S., & Boukorca, A. (2017). Eco-physic: Eco-physical design initiative for very large databases. Information Systems Journal , 68 , 44–63.

Sartiani, C., Bidoit, N., Colazzo, D., & Malla, N. (2018). Evaluating queries and updates on big xml documents. Information Systems Frontiers, 20(1). https://doi.org/10.1007/s10796-017-9744-4 .

Toosi, A.N., Khodadadi, F., & Buyya, R. (2016). SipaaS: Spot instance pricing as a service framework and its implementation in openstack. Concurrency and Computation: Practice and Experience , 28 (13), 3672–3690.

Wojciechowski, A. (2018). Etl workflow reparation by means of case-based reasoning. Information Systems Frontiers, 20(1). https://doi.org/10.1007/s10796-016-9732-0 .

Wu, D., Sakr, S., & Zhu, L. (2017). Big data storage and data models. In Handbook of big data technologies (pp. 3–29).


Acknowledgments

We hope readers will find the content of this special section interesting and that it will inspire them to look further into the challenges that still lie ahead in designing and exploiting advanced database and information systems.

The guest editors of this special section wish to express their sincere gratitude to all the authors who submitted their papers to this special section. We are also grateful to the Reviewing Committee for their hard work and the feedback provided to the authors. As guest editors, we also wish to express our gratitude to the Editors-in-Chief, Professor R. Ramesh and Professor H.R. Rao, for the opportunity to edit this special section for our ADBIS conference, for their assistance during its preparation, and for giving the authors the opportunity to publish their work in Information Systems Frontiers. We would like to mention that this is the first time the ADBIS conference has organized a special section for the Information Systems Frontiers journal (Springer), and that it coincides with the first French organization of ADBIS, held in Poitiers in 2015. Last but not least, we wish to thank the Journal's staff for their assistance and suggestions.

Author information

Authors and Affiliations

LIAS/ISAE-ENSMA - Poitiers University, Poitiers, France

Ladjel Bellatreche

INRIA and LIRMM, Montpellier, France

Patrick Valduriez

Institute of Computing Science, Poznan University of Technology, Poznan, Poland

Tadeusz Morzy


Corresponding author

Correspondence to Ladjel Bellatreche .


About this article

Bellatreche, L., Valduriez, P. & Morzy, T. Advances in Databases and Information Systems. Inf Syst Front 20, 1–6 (2018). https://doi.org/10.1007/s10796-017-9819-2


67 Data Management Essay Topics & Database Research Topics


  • Database Management Systems’ Major Capabilities
  • Data Assets Management of LuLu Hypermarkets System
  • Relational Database Management Systems in Business
  • Big Data Opportunities in Green Supply Chain Management
  • Big Data Management Research
  • Electronic Health Record Database and Data Management
  • Object-Oriented and Database Management Systems Tradeoffs
  • Information Technology-Based Data Management in Retail: The following paper discusses the specificities of data management and identifies the most apparent ethical considerations using retail as an example.
  • Why Open-Source Software Will (Or Will Not) Soon Dominate the Field of Database Management Tools: The study aims at establishing whether open-source software will dominate the database field because there has been a changing trend in the business market.
  • Modern Data Management and Organization Strategies: Today, with a shrinking focus on data and analytics, a proper data management strategy is imperative to meeting business goals.
  • Data Management in a Medium-Sized Business: This paper will use a medium-sized business data management offering highly specialized, high-quality business development education services as an example.
  • EHR Database Management: Diabetes Prevention: The data needed to prevent diabetes is usually collected throughout regular screenings conducted whenever a patient refers to a hospital, as well as by using various lab tests.
  • Data Management and Financial Strategies: By adopting comprehensive supply chain management, businesses can maximize the three main streams in the supply chain: information flow, product flow, and money flow.
  • Policy on People Data Management: Law No. (13) of 2016 is a data protection legislation that applies to all public institutions and private organizations across Qatar.
  • The Choice of a Medical Data Management System: The choice of a medical data management system is critically important for any organization providing healthcare services.
  • Data Analytics and Its Application to Management: The role of the collection of data and its subsequent analysis in the industry is as big as ever. Specifically, it pertains to the managerial field.
  • Technology-Assisted Reviews of Data in a Document Management System: The TAR that is used in DMS falls into two major categories. These are automatic TAR and semi-automatic TAR, where the latter implies the intervention of a human reviewer.
  • Data Collection and Management Techniques for a Qualitative Research Plan: To conduct complete qualitative research and present a cohesive qualitative research plan, it is necessary to match the structure and topic of the study.
  • Database Management and Machine Learning: Machine learning is used in science, business, industry, healthcare, education, etc. The possibilities of using machine learning technologies are constantly expanding.
  • Data Collection and Management Techniques of a Qualitative Research Plan: This research paper recommends the interview method for the collection of data and the application of NVivo statistical software for the management of data.
  • Big Data Fraud Management: The growth of eCommerce systems has led to an increase in online transactions using credit cards and other methods of payment services.
  • Deli Depot Case Study: Data Analysis Management Reporting: To improve customer service, Deli Depot has embarked on initiatives to better understand its customers. The company did market research using a questionnaire-based survey.
  • Data Storage Management Solutions: Losses of Personal Data: The term data refers to a collection of facts about anything. As it is often said, processed data results in information, and he who has information has power.
  • Data Management, Networking and Enterprise Software: Enterprise software is often created "in-house" and thus has a far higher cost as compared to simply buying the software solution from another company.
  • Childhood Obesity: Data Management: The use of electronic health records (EHR) is regarded as one of the effective ways to treat obesity in the population.
  • Health Data Management: Sharing and Saving Patient Data: One of the ways to facilitate achieving the idealized environment of data sharing is developing the methods of accessing health-related information.
  • Big Data Usage in Supply Chain Management: This paper gives a summary of the research that was conducted to understand the unique issues surrounding the use of big data in the supply chain.
  • Adopting Electronic Data Management in the Health Care Industry
  • Distributed Operating System and Infrastructure for Scientific Data Management
  • Advanced Drill Data Management Solutions Market: Growth and Forecast
  • The Changing Role of Data Management in Clinical Trials
  • Business Rules and Their Relationship to Effective Data Management
  • Class Enterprise Data Management and Administration
  • Developing Highly Scalable and Autonomic Data Management
  • Cloud Computing: Installation and Maintenance of Energy Efficient Data Management
  • Exploring, Mapping, and Data Management Integration of Habitable Environments in Astrobiology
  • Data Management: Data Warehousing and Data Mining
  • Efficient Algorithmic Techniques for Several Multidimensional Geometric Data Management and Analysis Problems
  • Data Management for Photovoltaic Power Plants Operation and Maintenance
  • Elderly Patients and Falls: Adverse Trends and Data Management
  • Data Management for Pre- and Post-Release Workforce Services
  • Epidemiological Data Management During an Outbreak of Ebola Virus Disease
  • Dealing With Identifier Variables in Data Management and Analysis
  • How Data Mining, Data Warehousing, and On-Line Transactional Databases Are Helping Solve the Data Management Predicament
  • Improving the New Data Management Technologies and Leverage
  • Integrated Process and Data Management for Healthcare Applications
  • Making Data Management Manageable: A Risk Assessment Activity for Managing Research Data
  • The Use of Temporal Database in the Data Management System
  • Multi-Cloud Data Management Using Shamir’s Secret Sharing and Quantum Byzantine Agreement Schemes
  • Data Management Is More Than Just Managing Data
  • Is Effective Data Management a Key Driver of Business Success?
  • National Data Centre and Financial Statistics Office: A Conceptual Design for Public Data Management
  • Big Data Management and Relevance of Big Data to E-Business
  • Redefining the Data Management Strategy: A Way to Leverage the Huge Chunk of Data
  • Structured Data Management Software Market in Taiwan
  • Towards Effective GML Data Management: Framework and Prototype
  • Data Management in Cloud Environments
  • Digital Communication: Enterprise Data Management
  • The Impact of Big Data on Data Management Functions
  • Analysis of Data Management Strategies at Tesco
  • The Best Data Management Tools Overview
  • What Is Data Management and Why Is It Important
  • Data Management and Use: Governance in the 21st Century
  • What Is Data Management and How Do Businesses Use It?
  • The Difference Between Data Management and Data Governance
  • Types of Data Management Systems for Data-First Marketing Strategies and Success
  • Reasons Why Data Management Leads to Business Success

Source: StudyCorgi (2022, June 5). "67 Data Management Essay Topics & Database Research Topics." https://studycorgi.com/ideas/data-management-essay-topics/ (collection last updated December 27, 2023).

Research Topics

Prof. Dr. Michael Grossniklaus


The steadily increasing informatization of society and the economy produces data at a rate never seen before. The volume and variety of available digital information continuously inspire new possibilities for gaining insights by analyzing this data.

To realize this potential, numerous research efforts are already underway, typically summarized under the umbrella of data science. Data science is a field that crosscuts many research areas of computer science, such as artificial intelligence, machine learning, data mining, databases, and information systems.

Our research falls into the last two of these areas and aims at supporting data science at the system level. Data science requires the management of new types of data as well as new, complex ways to process it. Our research method is to address these requirements by developing new and general solutions that leverage and extend core database and information systems technologies.

Within this broad area, our research focuses on challenges linked to data processing, in both traditional database systems and data stream management systems.

Graph Databases

We are currently investigating which data management technologies are best suited to which types of graph data applications.

Network Data Analytics

We are interested in the analysis of large network datasets and in the detection of traits that are shared among different types of networks.
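To make this concrete, here is a minimal sketch of the kind of trait comparison described above, using the third-party networkx package. The two example graphs and the chosen metrics are illustrative assumptions, not part of the group's actual work.

```python
# Minimal sketch (illustrative): comparing simple structural traits of two
# synthetic networks with the third-party networkx package.
import networkx as nx

def traits(g: nx.Graph) -> dict:
    """Compute a few structural traits commonly used to compare networks."""
    return {
        "nodes": g.number_of_nodes(),
        "edges": g.number_of_edges(),
        "avg_clustering": nx.average_clustering(g),
        "density": nx.density(g),
    }

if __name__ == "__main__":
    random_graph = nx.gnp_random_graph(100, 0.05, seed=42)   # Erdos-Renyi model
    scale_free = nx.barabasi_albert_graph(100, 3, seed=42)   # preferential attachment
    print("random     :", traits(random_graph))
    print("scale-free :", traits(scale_free))
```

Comparing such summary statistics across network types is one simple way to detect traits that are specific to, or shared across, different kinds of networks.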

Query Optimization


CS 764 Topics in Database Management Systems

This course covers a number of advanced topics in the development of database management systems (DBMS) and the modern applications of databases. The topics discussed include query processing and optimization, advanced access methods, advanced concurrency control and recovery, parallel and distributed data systems, cloud computing for data platforms, and data processing with emerging hardware. The course material will be drawn from a number of papers in the database literature. We will cover one paper per lecture. All students are expected to read the paper before coming to the lecture.

Prerequisites: CS 564 or equivalent. If you have concerns about meeting the prerequisites, please contact the instructor.

  • Red Book : Readings in Database Systems (5th edition) - edited by Bailis, Hellerstein, and Stonebraker.
  • Cow Book : Database Management Systems (3rd edition) - by Raghu Ramakrishnan and Johannes Gehrke, McGraw Hill, 2003.

Lecture Format: Each lecture focuses on a classic or modern research paper. Students will read the paper and submit a review to https://wisc-cs764-f22.hotcrp.com before the lecture starts. Here is a sample review for the paper on join processing.

Course projects: A big component of this course is a research project. For the project, you pick a topic in the area of data management systems and explore it in depth. Lists of suggested project topics were created in 2020, 2021, and 2022, but you are encouraged to select a project outside of these lists. The course project is a group project, and each group must be of size 2-4. Please start looking for project partners right away. The course project will include a project proposal, a short presentation at the end of the semester, and a final project report. Three sample projects from previous years are available (sample1, sample2, sample3). The project has the following deadlines:

  • Proposal due: Oct. 24
  • Presentation: Dec. 12 & 14
  • Paper submission: Dec. 19

Compute resources:

  • CloudLab: https://www.cloudlab.us/signup.php?pid=NextGenDB (project name: NextGenDB)
  • Chameleon: https://www.chameleoncloud.org (project name: ngdb)

Grading:

  • Paper review: 15%
  • Project proposal: 10%
  • Project presentation: 10%
  • Project final report: 30%

Information

  • Eugene Wu (Instructor) OH: TBA 421 Mudd
  • Class: Th 2-4PM
  • Syllabus & FAQ
  • Reviews Wiki
  • Req: W4111 Intro to DB
  • Pref: W4112 DB Impl
  • Ugrads OK; see Prof Wu
  • Proposal 5%
  • Paper Draft 10%
  • Demo/Poster 10%
  • Participation 10%

Data management systems are the cornerstone of modern applications, businesses, and science (including data science). If you were excited by the topics in 4111, this graduate-level course will be a deep dive into classic and modern database systems research. Topics will range from classic database system design and modern optimizations in single-machine and multi-machine settings to data cleaning and quality and application-oriented databases. This semester's theme will look at how learning has affected many classic data management systems challenges, and also how data management systems support and extend ML needs.

See FAQ for difference between 6113 and the other database courses.

  • Class: Th 2-4PM in 829 Mudd
  • Instructor: Eugene Wu , OH: Thurs 12-1PM 421 Mudd
  • Syllabus & FAQ , Slack , Project , Papers
  • Prereqs: W4111 Intro to DB (required), W4112 DB Implementations (recommended). Ugrads OK; see Prof Wu
  • Discussion Prep 30%
  • Class participation 30%
  • Project 40%: Final Presentation 10% , Paper 30%


Course design inspired by

  • Cal’s CS286
  • Waterloo’s CS848
  • Colin Raffel’s Role playing seminar
  • Carl Vondrick’s self supervision graduate seminar

iNetTutor.com

Online Programming Lessons, Tutorials and Capstone Project guide

40 List of DBMS Project Topics and Ideas

Introduction

A capstone project is the final project of an IT degree program. It is made up of one or more research projects in which students create prototypes, services, and/or products. The projects are organized around an issue that needs to be handled in real-world scenarios. When IT departments want to test new ideas or concepts before adopting them into their daily operations, they implement them as capstone projects within their services.

In this article, our team has compiled a list of Database Management System project topics and ideas. The capstone projects listed below will assist future researchers in deciding which capstone project idea to pursue. Future researchers may find the information on this page useful for coming up with unique capstone project ideas.

  • Telemedicine Online Platform Database Design

  “Telemedicine Online Platform” is designed to allow doctors to deliver clinical support to patients remotely. Doctors can communicate with their patients in real-time for consultations, diagnoses, monitoring, and medical supply prescriptions. The project will be developed using the SDLC method by the researchers. The researchers will also compile a sample of hospital doctors and patients who will act as study participants. A panel of IT specialists will review, test, and assess the project.

  • Virtual and Remote Guidance Counselling System Database Design

Counseling is a vital component of a person’s life since it aids in the improvement of interpersonal relationships. People must stop ignoring this issue because it is essential for the development of mental wellness. The capstone project “Virtual and Remote Guidance Counselling System,” which covers the gap in giving counseling in stressful situations, was built for this reason. It addresses the need to fill the gaps in the traditional technique and make it more effective and immersive.

Virtual and Remote Guidance Counselling System Database Design - Relationship

  • COVID-19 Facilities Management Information System Database Design

COVID-19 has put people in fear due to its capability of transmission when people are exposed to the virus. The health sector and the government provide isolation facilities for COVID-19 patients to mitigate the spread and transmission of the virus. However, communication about the availability of these facilities is inefficient, resulting in a surge of patients in a single facility while some are transferred multiple times due to unavailability. COVID-19 responders need advanced tools to manage the COVID-19 facilities so they can easily look for available facilities and cater to more patients.

  • Document Tracking System Database Design

The capstone project, “Document Tracking System,” is purposely designed for companies and organizations, allowing them to electronically store and track documents. The system will track the in/out movement of documents across different departments. The typical way of tracking documents uses a manual approach: the staff call or personally ask for updates about the documents, which is time-consuming and inefficient.

  • Face Recognition Application Database Design

Technology has grown so fast that it changes the way we do our daily tasks and has made our daily lives easier. The capstone project entitled “Face Recognition Attendance System” is designed to automate the checking and recording of students’ attendance during school events using face recognition technology. The system works by storing the students’ information along with their photographs on a server; during school events it detects the students’ faces, matches and verifies them, and records the presence or absence of each student.

Face Recognition Application Database Design - List of Tables
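As an illustration of the matching step described above, the sketch below compares a detected face embedding against stored student embeddings and returns the best match. How the embeddings are produced (e.g., by a face-recognition model) is outside this sketch, and all names and the threshold are hypothetical assumptions, not the project's actual design.

```python
# Illustrative sketch only: matching a detected face embedding against stored
# student embeddings to record attendance. Embeddings are assumed to come
# from some face-recognition model; all names here are hypothetical.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def record_attendance(detected, enrolled, threshold=0.8):
    """Return the best-matching student ID above the threshold, else None."""
    best_id, best_score = None, threshold
    for student_id, embedding in enrolled.items():
        score = cosine_similarity(detected, embedding)
        if score > best_score:
            best_id, best_score = student_id, score
    return best_id
```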

  • Digital Wallet Solution Database Design

The capstone project, named “Digital Wallet Solution,” is intended to allow people to store money online and make payments online. The digital wallet transactions accept a variety of currencies and provide a variety of payment gateways via which the user can pay for products and services. The system allows users to conduct secure and convenient online financial transactions. It will speed up payment and other financial processes, reducing the amount of time and effort required to complete them.

  • Virtual Online Tour Application Database Design

The usage of technology is an advantage in the business industry, especially during this challenging pandemic. It allows businesses to continue to operate beyond physicality. The capstone project entitled “Virtual Online Tour Application” is designed as a platform to streamline virtual tours for clients. Any business industry can use the system to accommodate and provide their clients with a virtual experience of their business. For example, the tourist industry and real estate agencies can use the system to provide a virtual tour to their clients about the tourist locations and designs of properties, respectively.

  • Invoice Management System Database Design

The researchers will create a system that will make it easier for companies to manage and keep track of their invoice information. The company’s sales records, payables, and total invoice records will all be electronically managed using this project. Technology is highly used for business operations and transactions automation. The capstone project, entitled “Invoice Management System” is designed to automate the management of the company’s invoice records. The said project will help companies to have an organized, accurate, and reliable record that will help them track their sales and finances.

Invoice Management System Database Design - List of Tables
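As a hedged sketch of what an invoice database design could look like, the snippet below creates a minimal, hypothetical table layout with Python's built-in sqlite3 module. The table and column names are assumptions for illustration; the actual capstone design may differ.

```python
# Hypothetical, minimal schema for an invoice management database,
# sketched with the standard-library sqlite3 module.
import sqlite3

schema = """
CREATE TABLE IF NOT EXISTS customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT
);
CREATE TABLE IF NOT EXISTS invoice (
    invoice_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    issued_on   TEXT NOT NULL,          -- ISO date string
    status      TEXT DEFAULT 'unpaid'   -- unpaid / paid / overdue
);
CREATE TABLE IF NOT EXISTS invoice_line (
    line_id     INTEGER PRIMARY KEY,
    invoice_id  INTEGER NOT NULL REFERENCES invoice(invoice_id),
    description TEXT NOT NULL,
    quantity    INTEGER NOT NULL,
    unit_price  REAL NOT NULL
);
"""

with sqlite3.connect("invoices.db") as conn:
    conn.executescript(schema)
```

Splitting invoices into a header table and a line-item table is a common normalization choice that keeps per-item quantities and prices queryable.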

  • Vehicle Repair and Maintenance Management System Database Design

Information technology has become an integral part of any kind of business in terms of automating business operations and transactions. The capstone project, entitled “Vehicle Repair and Maintenance Management System,” is designed for vehicle repair and maintenance management automation. The said project will automate the vehicle garage’s operations and daily transactions. The system will automate operations such as managing vehicle repair and maintenance records, invoice records, customer records, transaction records, and billing and payment records.

  • Transcribe Medical Database Design

Information technology has made everything easier and simpler, including transcribing the medical diagnoses of patients. The capstone project, entitled “Medical Transcription Platform,” is designed to allow medical transcriptionists to transcribe audio recordings of medical consultations and patient diagnoses in a centralized manner. A medical transcriptionist is vital for keeping accurate and credible medical records of patients, which can be used by other doctors to learn the patients’ medical history. The said project will serve as a platform where transcribed medical audio is stored for safekeeping and easy retrieval.

  • Multi-branch Travel Agency and Booking System Database Design

The capstone project, entitled “Multi-Branch Travel Agency and Booking System,” is designed as a centralized platform wherein multiple travel agency branches are registered to ease and simplify inquiries and booking of travels and tour packages by clients. The said project will allow travel agencies to operate a business in an easy, fast manner considering the convenience and safety of their clients. The system will enable travel agencies and their clients to have a seamless online transaction.

  • Pharmacy Stocks Management Database Design

The capstone project “Pharmacy Stocks Management System” allows pharmacies to manage and monitor their stocks of drugs electronically. The Pharmacy Stocks Management System will automate inventory to help ensure that the pharmacy has enough stock of medications and supplies to serve the needs of the patients.

  • Loan Management with SMS Database Design

The capstone project entitled “Loan Management System with SMS” is an online platform that allows members to apply for and request loans. In addition, they can also monitor their balance on their respective dashboards. The cooperative’s management will first review the application and approve or disapprove the request. Notifications will be sent through the SMS (short messaging service) feature of the system.

Loan Management System with SMS Database Design - List of Tables

  • Service Call Management System Database Design

The capstone project, entitled “Service Call Management System,” is designed to move service calls to a centralized platform. The project allows clients to log in and lodge calls to tech support if they encounter issues and difficulties with their purchased products. The tech support team will diagnose the issue and guide them through the necessary actions over a call to solve the problem and achieve satisfaction.

  • File Management with Approval Process Database Design

The File Management System provides a platform for submitting, approving, storing, and retrieving files. Specifically, the capstone project is for the file management of various business organizations. This is quite beneficial in the management and organization of the files of every department. Installation of the system on an intranet is possible, as is uploading the system to a live server, from which the platform can be viewed online and through the use of a browser.

  • Beauty Parlor Management System Database Design

The capstone project entitled “Beauty Parlour Management System” is an example of a transaction processing system that focuses on the records and processes of a beauty parlour. This online application will help the management keep and manage their transactions in an organized, fast, and efficient manner.

  • Exam Management System Database Design

Information technology plays a significant role in the teaching and learning process of teachers and students, respectively. IT offers a more efficient and convenient way for teachers and students to learn and assess learnings. The capstone project, “Exam Management System,” is designed to allow electronic management of all the information about the exam questions, courses and subjects, and teachers and students. The said project is an all-in-one platform for student exam management.

Exam Management System Database Design - List of Tables

  • Student and Faculty Clearance Database Design

The capstone project, entitled “Student and Faculty Clearance System,” is designed to automate students and faculty clearance processes. The approach is intended to make the clearance procedure easier while also guaranteeing that approvals are accurate and complete. The project works by giving every Department involved access to the application. The proposed scheme can eliminate the specified challenges, streamline the process, and verify the integrity and correctness of the data.

  • Vehicle Parking Management System Database Design

The capstone project entitled “Vehicle Parking Management System” is an online platform that allows vehicle owners to request or reserve a parking slot. Management can accept or decline reservation requests. In addition, a payment option is also part of the system’s features but is limited to on-site payment.

  • Hospital Resources and Room Utilization Database Design

The capstone project, “Hospital Resources and Room Utilization Management System,” is a system designed to streamline the process of managing hospital resources and room utilization. The project is critical especially now that we are facing a pandemic and there is a need for efficient management of hospital resources and rooms. Efficient management will prevent shortages in supplies and overcrowding of patients in hospitals.

Hospital Resources and Room Utilization Database Design

  • Church Event Management System Database Design

The capstone project entitled “Church Event Management System” is designed to be used by church organizations in creating and managing different church events. The conventional method of managing church events is manual, and members of organizations face difficulties due to physical barriers and time constraints.

  • CrowdFunding Platform Database Design

Business financing is critical for new business ventures. In this study, the researchers concentrate on designing and developing a business financing platform that is effective for new startups. This capstone project, entitled “Crowdfunding Platform,” is a website that allows entrepreneurs to promote their new business ventures in order to attract investors and crowdfunding.

  • Vehicle Franchising and Drivers Offense Software Database Design

The proposed software will be used to electronically process and manage vehicle franchising and drivers’ offenses. It will eliminate the manual method, which involves a lot of paperwork and consumes a valuable amount of time. The proposed project will serve as a centralized platform where recording and paying for the offenses committed by drivers will be processed. The system will quicken the process of completing transactions between enforcers and drivers. Vehicle franchising and managing driver offenses will be easy, fast, and convenient using the system.

  • Student Tracking Performance Database Design

The capstone project entitled “Student Academic Performance Tracking and Monitoring System” allows academic institutions to monitor and gather data about the academic performance of students, from which decisions are derived to further improve students’ learning outcomes. Tracking and monitoring students’ performance plays a vital role in providing information used to assist students, teachers, administrators, and policymakers in making decisions that will further improve students’ academic performance.

  • Webinar Course Management System Database Design

The capstone project, entitled “Webinar Course Management System,” is designed to automate managing webinar courses. The project aims to eliminate the current method, which is inefficient and inconvenient for parties involved in the webinar. A software development life cycle (SDLC) technique will be used by the researchers in order to build this project. They will gather a sample size of participating webinar members and facilitators to serve as respondents of the study.

  • Online Birth Certificate Processing System with SMS Notification Database Design

The capstone project, “Online Birth Certificate Processing System with SMS Notification “ is an IT-based solution that aims to automate the process of requesting, verifying, and approving inquiries for original birth records. The system will eliminate the traditional method and transition the birth certificate processing into an easy, convenient, and efficient manner. The researchers will develop the project following the Software Development Life Cycle (SDLC) technique.

  • Food Donation Services Database Design

Information technology plays a significant role in automating the operations of many companies to boost efficiency. One of these is the automation of food donation and distribution management. “Food Donation Services,” the capstone project, is intended to serve as a platform for facilitating transactions between food groups, donors, and recipients. Food banks will be able to respond to various food donations and food assistance requests in a timely and effective manner as a result of the project.

  • COVID Profiling Database Design

The capstone project “City COVID-19 Profiling System with Decision Support” is designed to automate the process of profiling COVID-19 patients. The project will empower local health officers in electronically recording and managing COVID-19 patient information such as symptoms, travel history, and other critical details needed to identify patients. Manual profiling is prone to human mistakes, necessitates a lot of paperwork, and needs too much time and effort from the employees.

  • Evacuation Center Database Design

Calamities can have a significant impact on society and may result in an enormous number of people being evacuated. The local government unit assigns evacuation centers to provide temporary shelter for people during and after a disaster. Evacuation centers can be churches, sports stadiums, community centers, and other venues capable of providing emergency shelter.

  • QR Code Fare Payment System Database Design

The capstone project, “QR Code Fare Payment System,” is designed to automate the procedure of paying a fare when riding a vehicle. Passengers register in the system to receive their own QR code, which they use to pay their fares by scanning it on the system’s QR code scanning page. The project will enable cashless fare payment.
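As a small, hedged sketch of the idea, the snippet below issues a passenger a payment token and renders it as a QR image using the third-party qrcode package (installable as qrcode[pil]). The token format and the passenger ID are assumptions for demonstration only, not the project's actual scheme.

```python
# Illustrative sketch: issuing a passenger a payment token and rendering it
# as a QR code image with the third-party "qrcode" package.
import uuid
import qrcode

def issue_fare_token(passenger_id: str) -> str:
    """Create an opaque token the scanner can later look up in the database."""
    return f"{passenger_id}:{uuid.uuid4().hex}"

token = issue_fare_token("PASSENGER-0042")   # hypothetical passenger ID
img = qrcode.make(token)                     # returns a PIL image
img.save("fare_qr.png")                      # printed or shown in the rider's app
```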

  • Web Based Psychopathology Diagnosis System Database Design

The capstone project entitled “Web-Based Psychopathology Diagnosis System” is designed for patients and medical staff in the field of psychopathology. The system will be a centralized platform to be used by patients and psychopathologists for consultations. The said project will also keep all the records electronically. Mental health is important. Each individual must give importance to their mental health by paying attention to it and seek medical advice if symptoms of mental disorders and unusual behavior occur.

  • Service Marketplace System Database Design

The capstone project, “Services Marketplace System” is designed to serve as a centralized platform for marketing and inquiring about different services. The system will serve as a platform where different service providers and customers will have an automated transaction. Technology made it easier for people to accomplish daily tasks and activities. In the conventional method, customers avail themselves of services by visiting the shop that offers their desired services personally.


  • Fish Catch System Database Design

The capstone project, entitled “Fish Catch Monitoring System,” will automate the process of recording and monitoring fish catches. The said project is intended to be used by fishermen and fish markets to accurately record fish catches and will also keep the records electronically safe and secure.

  • Complaints Handling Management System Free Template Database Design

The capstone project, “Complaint Handling Management System” is a system designed to help educational institutions to handle and manage complaints electronically. The system will improve the response time of the school’s management in addressing the complaints of the students, parents, staff, and other stakeholders.

  • Senior Citizen Information System Free Template Database Design

The system will replace the manual method of managing the information and records of senior citizens with an electronic one. The system will serve as a repository of the records of senior citizens within the scope of a specific local government unit. By using the system, paperwork will be lessened and human errors in file handling will be avoided. The system is efficient enough to aid in managing and keeping the records of senior citizens in the different barangays.

  • Online and SMS-Based Salary Notification Database Design

“Online and SMS-Based Salary Notification” is a capstone project intended to be used by companies and employees to automate the process of notifying employees of their salary details. The application works by allowing the designated company encoder to encode salary details, while employees log in to their accounts in the application and access the details of their salary. One of the benefits of being employed is being paid; employers manage employees’ salaries and are responsible for discussing the salary system and deductions with employees.

  • Maternal Records Management Database Design

The capstone project, “Maternal Records Management System” is a system that automates the process of recording and keeping maternal records. The said project will allow maternity clinics to track and monitor their patients’ records from pregnancy to their baby’s immunization records.

  • Online Complaint Management System Database Design

The Online Complaint Management System is a capstone project designed to serve as a platform to address complaints and resolve disputes. The system provides an online way of resolving problems faced by the public or by people within the organization. The system will make complaints easier to coordinate, monitor, track, and resolve.

  • Online Donation Database Design

The capstone project “Online Donation Platform for DSWD” is an online platform for giving and requesting donations in the Department of Social Welfare and Development (DSWD). The system will be managed by DSWD staff, who verify donors and eligible beneficiaries electronically. The system will have an SMS feature to notify donors and beneficiaries about the status of their requests.

  • OJT Timesheet Monitoring System using QR Code Database Design

The capstone project, “OJT Timesheet Monitoring System using QR Code,” allows employers to automate the timesheet of each trainee for easy monitoring. On-the-job trainees will use the system to record their daily time in and out using the QR code generated by the system. The entire system will be managed by the administrator.

Technology is credited with driving change in a wide range of enterprises and institutions. Because of information technology, the world has changed dramatically; it is difficult to imagine an industry or organization that has not benefited from technological advances. In these businesses, the most common role of IT has been to automate numerous procedures and transactions in order to increase efficiency and improve people’s overall experience and satisfaction. The aforementioned capstone project ideas will be useful in a range of sectors, helping to enhance operational efficiency as well as the services provided to each project’s users.




Current Trends and New Challenges of Databases and Web Applications for Systems Driven Biological Research

Pradeep Kumar Sreenivasaiah

1 Systems Biology Research Center and College of Life Science, Gwangju Institute of Science and Technology, Gwangju, Republic of Korea

The dynamic and rapidly evolving nature of systems-driven research imposes special requirements on the technology, approach, design, and architecture of computational infrastructure, including databases and Web applications. Several solutions have been proposed to meet these expectations, and novel methods have been developed to address the persisting problems of data integration. It is important for researchers to understand the different technologies and approaches; having familiarized themselves with the pros and cons of the existing technologies, researchers can exploit their capabilities to the maximum potential for integrating data. In this review we discuss the architecture, design, and key technologies underlying some of the prominent databases and Web applications. We describe their roles in the integration of biological data and investigate some of the emerging design concepts and computational technologies that are likely to have a key role in the future of systems-driven biomedical research.

Introduction

Nobel Laureate Ivan P. Pavlov tried to understand basic animal physiology by methodically planning surgical experiments, which he believed could advance knowledge in humans. His main contributions were in studying the neuronal input to the stomach and pancreas that triggers secretions of acid and digestive enzymes in anticipation of the ingestion of a desirable food. This cephalic phase experiment by Pavlov was a brilliant demonstration of the systems approach, studying the interaction of multiple subsystems such as the brain and gut, even though the investigative techniques used by Pavlov are now considered conventional. The Nobel Prize awarded to Pavlov was the first ever awarded for studies in integrated systems physiology (Wood, 2004).

We want to emphasize that the nature of systems physiology has, from the beginning, been interdisciplinary. Systems physiology promotes the view that biological components are not merely isolated entities but are, on the contrary, part of a highly interconnected, coherently functioning dynamic network. The field is particularly concerned with recognizing the importance of interactions between biological components and the consequences of those interactions. Thus, systems physiology embodies holistic views of how molecules, pathways, and networks interact to establish a functioning system at different levels of organization, from molecules, organelles, cells, and tissues to organs and even entire organisms, and further, how malformations in these systems lead to disease. With all this knowledge in hand, we can step forward to develop a detailed computational model of the human body.

Progress in systems-driven research [e.g., systems biology, physiome, systems physiology, systems pharmacology, virtual physiological human (VPH), personal health systems, life science e-infrastructures] is significantly driven by the development of suitable computational infrastructure, including tools and information resources. Over the past few years a variety of high-throughput methodologies (∼omics) have been developed that have enabled large-scale studies of biological components at different organizational levels and various scales: the genome; the interactome; cellular function; tissue and whole-organ structure–function relationships; and integrative functions of the whole organism (e.g., behavior and consciousness), to name a few. This has generated massive amounts of data about biological components in multiple sets of experimental conditions. Most of the contributions are from laboratories around the world following proprietary standards, techniques, and methods. Systems biologists seek to integrate and interpret such massive amounts of highly heterogeneous information to understand how a biological system functions. Issues and challenges in data integration have been meticulously addressed by the development of several biological data standardization initiatives (e.g., SBML, Finney and Hucka, 2003; insilicoML, Asai et al., 2008; and CellML, Lloyd et al., 2004), ontologies (e.g., GO, Ashburner et al., 2000; SBO, Le Novere, 2006; and BioPAX, 2006), and the establishment of large software infrastructures and tools (e.g., NCBI, Sayers et al., 2010; EBI, Brooksbank et al., 2005; Bioconductor, Gentleman et al., 2004; eScience, Hey and Trefethen, 2003). This progress in the field of biomedical data integration has resulted in the development of a good methodological and technological framework, and much of this has happened in just the last decade. But like others, we believe that the cornucopia of best practices is still evolving and that major developments in data integration are underway. The anticipation is that, once established, such computational infrastructure will enable collaborative investigation of complex biological systems and will help tackle the challenges underlying systems physiology research.

The objective of this review is to understand the major data integration challenges. We discuss the kinds of integration approaches and technologies that have been tried to meet these challenges. The progress, pros, and cons of the major technologies supporting systems research are reflected in the substance of our discussions. We also discuss the future technologies and new challenges that are anticipated and that might help the progress of systems physiology. We have compiled a glossary of terms and a list of useful databases and Web applications and made them available as supplementary online web pages, accessible at this Web address: http://cidms.org/systems_research/resource.html.

Overview of Problems in Databases and Web Applications in Integrating Information for Systems-Scale Analysis

Systems biological research is evolving dynamically at a rapid pace. Research in systems biology is becoming more sophisticated in terms of the capabilities expected from databases and Web applications. Owing to the very complex nature of biological systems, systematically enumerating every requirement for developing computing infrastructure is highly difficult. Designing and building such computational systems involves numerous standards, technologies, frameworks, and toolkits, which are complex, increasingly expensive to build and maintain, and require meticulous planning and management.

In our view, the critical issues associated with databases and Web applications supporting systems research are listed in Box 1. Current emerging data integration approaches and technologies should address these issues in order to facilitate continued progress in systems research.

Box 1. A summary of critical issues

Data availability: Data availability deals with the issues associated with accessing data in public and private setups, which by and large are influenced by institutional policy differences.

Data quantity: Systems research is iterative and data intensive. Current data will give rise to new information and models and in turn will result in more new data with variations. This cycle continues, and data volume increases exponentially. Therefore, management of data quantity is crucial for systems-driven research.

Data quality: Data quality describes a set of data properties characterizing their ability to satisfy users' expectations or requirements concerning data usage for acquiring information in a given area of interest, learning, and decision making. Databases should institute quality-check measures to ensure that the data they provide to the research community are of high quality. It is much easier to enforce quality measures in a closed setup, but it is a major problem to be addressed in a social collaborative environment (e.g., Wikipedia, 2010). Poor-quality data may contain incomplete or missing fields, or the data may be represented in non-standard or legacy formats that create problems for data and information integration. To ensure high quality, after data are received, databases should apply their own quality measures, which may also include manual curation by domain experts. There should be a standardized mechanism to ensure the consistency and completeness of each submission. Usage of semantics-aware forms (ontology-guided forms, discussed later) for data procurement and on-the-spot data entry field validation using advanced Web scripts may minimize the proliferation of inaccurate data.
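A minimal sketch of the kind of on-the-spot submission validation mentioned above is shown below. The required fields and the controlled vocabulary are hypothetical placeholders; real systems would validate against community ontologies rather than a hard-coded set.

```python
# Toy validation of an incoming data submission against required fields and
# a (hypothetical) controlled vocabulary, before the record is accepted.
REQUIRED_FIELDS = {"sample_id", "organism", "tissue", "assay"}
CONTROLLED_ASSAYS = {"rna-seq", "chip-seq", "mass-spec"}

def validate_submission(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    assay = record.get("assay", "").lower()
    if assay and assay not in CONTROLLED_ASSAYS:
        problems.append(f"unknown assay term: {assay!r}")
    return problems

print(validate_submission({"sample_id": "S1", "organism": "mouse", "assay": "RNA-Seq"}))
# -> ["missing fields: ['tissue']"]
```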

Data access: Systems researchers often work with diverse sets of data, usually from different biological levels of organization (molecular, cellular, organism, etc.). Computational frameworks that store data and allow data access by query are needed. Such frameworks could give access to data by accumulating it into one central repository, or simply through a uniform interface that gives access to multiple heterogeneous, geographically separated databases hosting their own data. With multiple heterogeneous data sources, querying and extracting data becomes a problem; the ideal expectation is a single query that fetches information spanning several sources. Taking one step further, linking biological entities to each other in a meaningfully related manner and enhancing interoperability can be realized by embedding semantic awareness into the framework, which could enhance query capabilities. Past the query, researchers can retrieve data, compare, and analyze until the desired endpoint is attained. This step could be facilitated if analysis and visualization tools are built into the integrated computational framework, allowing users to specify and carry out in silico experiments, record intermediate and final results, and annotate experiments.
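The sketch below illustrates the "single query across multiple sources" idea in the simplest possible terms: a tiny mediator fans one query out to several sources behind a uniform interface and merges the results. The source classes and their contents are invented placeholders, not any real database's API.

```python
# Conceptual sketch of a tiny mediator: one query, several autonomous
# (hypothetical) sources behind a uniform interface, one merged result list.
class Source:
    def search(self, term: str) -> list:
        raise NotImplementedError

class GeneSource(Source):
    def search(self, term):
        return [{"source": "genes", "id": "G1", "match": term}]

class PathwaySource(Source):
    def search(self, term):
        return [{"source": "pathways", "id": "P7", "match": term}]

def federated_search(term: str, sources: list) -> list:
    """Fan the query out to every source and merge the results for the user."""
    results = []
    for src in sources:
        results.extend(src.search(term))
    return results

print(federated_search("TP53", [GeneSource(), PathwaySource()]))
```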

Data visualization: Visualization of raw and modeled data is an important tool for analyzing and interpreting complex and interconnected data. Visualizing data as pathways and networks has helped researchers record and communicate their findings. The irony is that, in systems research, a user visualizing a deluge of data may be overwhelmed by it rather than reaping any benefit at all.

The matter turns worse when visualization needs to support user interactivity (e.g., during the assembly, curation, and modeling of complex models), particularly when done in a collaborative setup. Web applications and tools supporting visualization should therefore use niche techniques to present the data at the right level of detail in a cohesive, insightful manner (Gehlenborg et al., 2010). The various ways of visually representing the same knowledge hinder effective communication between different biological communities, and it takes extra effort for biologists to familiarize themselves with different notations. Only recently has a graphical representation standard for biology, SBGN (Le Novère et al., 2009), been proposed. However, only a minuscule number of tools have incorporated SBGN (e.g., CellDesigner, Kitano et al., 2005; and SBGN-ED, Czauderna et al., 2010), and SBGN awaits adoption by biological tools, databases, and Web applications. One major problem for SBGN adoption is that none of the Web browsers support rendering graphics written in SBGN. Alternatively, implementing the SBGN graphic notation specification in Scalable Vector Graphics (SVG) would be an innovative solution for databases and Web applications (e.g., as proposed by us in CIDMS-PD: Cardiac Integrated Database Management System-Pathway Database, 2009). SVG is a generic graphical representation format that has already been widely adopted by the internet community (Scalable Vector Graphics, 2010); it is supported and rendered by most Web browsers. The community has yet to pick up this idea and act to develop practical visualization applications.
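To make the SVG suggestion tangible, the sketch below emits a two-node "pathway" drawing as plain SVG text that any modern browser can render. The drawing is invented purely for illustration and is not an SBGN-compliant diagram.

```python
# Minimal sketch: emitting pathway-style graphics directly as SVG so a Web
# browser can render them without any plugin. Not an SBGN-compliant drawing.
def node(x, y, label):
    return (f'<rect x="{x}" y="{y}" width="120" height="40" rx="8" '
            f'fill="white" stroke="black"/>'
            f'<text x="{x + 60}" y="{y + 25}" text-anchor="middle">{label}</text>')

def edge(x1, y1, x2, y2):
    return f'<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" stroke="black"/>'

svg = ('<svg xmlns="http://www.w3.org/2000/svg" width="400" height="120">'
       + node(20, 20, "Kinase A") + node(240, 20, "Substrate B")
       + edge(140, 40, 240, 40)
       + '</svg>')

with open("pathway.svg", "w") as fh:
    fh.write(svg)   # opens in any modern Web browser
```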

Data representation and standards: The collaborative nature of systems research places emphasis on conforming to standards and data formats for searching, information exchange, and mutual understanding. Standards can be developed informally among a group of researchers, or they can be enforced by journals and funding organizations or even by the software tools used. They help to link data and tools into an integrative framework. Standards can also help to avoid misunderstanding and duplication of work, but this will only take off if the community at large can reach consensus on using a handful of them. Using an agreed set of standards and data formats increases processing efficiency in a large-scale integrative computational framework, as it minimizes unnecessary, inefficient conversions between standards. Data in databases and on the Web is a mix of structured and unstructured formats, and the representation mechanisms are usually simple and diverse, so access by machines becomes a fundamental problem.
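One reason shared formats matter is that standard, structured representations can be read by generic machinery. The sketch below parses a simplified, SBML-like XML fragment with the Python standard library alone; the fragment is a made-up, stripped-down example and not a valid SBML document.

```python
# Sketch: a simplified, SBML-like model fragment parsed with the standard
# library. The XML below is invented for illustration, not real SBML.
import xml.etree.ElementTree as ET

MODEL_XML = """
<model name="toy_glycolysis">
  <species id="glucose" initialAmount="10"/>
  <species id="pyruvate" initialAmount="0"/>
  <reaction id="glycolysis" reversible="false"/>
</model>
"""

root = ET.fromstring(MODEL_XML)
species = {s.get("id"): float(s.get("initialAmount")) for s in root.findall("species")}
print(species)   # {'glucose': 10.0, 'pyruvate': 0.0}
```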

Security: Communication of data between application systems must ensure security to avoid improper access. Trust, or the lack thereof, is the most essential factor blocking the adoption of rapidly evolving Web technology paradigms such as software as a service (SaaS, explained later) and data distribution services. This issue is usually addressed by the database management system (DBMS) or framework, which has mechanisms to handle several security attributes, such as multi-tenancy (clients sharing vital data with servers), data access checks, levels of security clearance based on roles (e.g., admin, general user, curator), data sharing with other organizations or participants, and keeping vital data safe from prying eyes.
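A toy sketch of role-based clearance checks of the kind described above follows. The roles, permissions, and decorator are hypothetical; production systems would rely on the security layer of the DBMS or application framework rather than hand-rolled checks.

```python
# Toy role-based access control: each role maps to a set of permissions, and
# a decorator rejects calls whose role lacks the required permission.
from functools import wraps

PERMISSIONS = {
    "admin":   {"read", "write", "delete", "share"},
    "curator": {"read", "write"},
    "general": {"read"},
}

def requires(permission):
    def decorator(func):
        @wraps(func)
        def wrapper(user_role, *args, **kwargs):
            if permission not in PERMISSIONS.get(user_role, set()):
                raise PermissionError(f"role {user_role!r} lacks {permission!r}")
            return func(user_role, *args, **kwargs)
        return wrapper
    return decorator

@requires("write")
def update_record(user_role, record_id, payload):
    return f"record {record_id} updated by a {user_role}"

print(update_record("curator", 42, {"field": "value"}))   # allowed
# update_record("general", 42, {})                        # raises PermissionError
```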

Version control (VC): A set of mechanisms that support the evolution of developed artifacts (e.g., source code, analysis and design documents, data, and models) in a computer application. VC helps with trackback and data provenance. Explicitly exposing version metadata to clients can aid in reinforcing data quality. This issue is usually addressed by the DBMS.

Interoperability: Interoperability issues are the problems associated with bringing together heterogeneous and distributed information systems. Today, research on interoperability solutions has moved the technology from single monolithic, expensive solutions to distributed, collaborative, inexpensive solutions. With such a trend often arise problems of semantic heterogeneity, data integrity, data representation and data migration, and the correctness of the interpretation of data sets obtained from different resources. Semantic heterogeneity, for example, deals with conflicts due to multiple names being used for the same concept in different resources, or multiple interpretations of the same name. Machines must be told explicitly about such disparities, which seem very intuitive to humans. Interoperability issues can be attributed to system heterogeneities occurring at different levels, including between software tools, between analytical methodologies, and among data and databases. These problems can be a serious predicament during data integration, analysis, and knowledge discovery.
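The sketch below shows the simplest possible treatment of one facet of semantic heterogeneity: mapping the different labels used by different resources onto one canonical identifier. The synonym table is an invented stand-in for what an ontology or terminology service would provide.

```python
# Toy resolution of semantic heterogeneity: source-specific labels are
# normalized to one canonical identifier before records are merged.
SYNONYMS = {
    "tp53": "TP53",
    "p53": "TP53",
    "tumor protein p53": "TP53",
}

def canonical(name: str) -> str:
    """Normalize a source-specific label to the shared identifier (if known)."""
    return SYNONYMS.get(name.strip().lower(), name)

records_from_two_databases = [{"gene": "p53"}, {"gene": "Tumor protein p53"}]
merged = {canonical(r["gene"]) for r in records_from_two_databases}
print(merged)   # {'TP53'} - both records now refer to the same entity
```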

Computationally intensive: Modeling a living organism is a complex task, a fact that reflects the inherent complexity of biological systems themselves. A modeling workflow typically includes: defining the problem scope and drawing boundaries according to the questions that need to be answered; integrating large and diverse data; using the integrated data to formalize the problem as model(s) in a machine-readable language; executing the model on the computational infrastructure (including software and hardware); and validating, analyzing, and visualizing the results. A varying amount of computing infrastructure is required at every step of the modeling workflow, and this requirement grows with more complex models (e.g., incorporating finer spatial and temporal resolutions into a model increases its complexity) (Burrage et al., 2006). All aspects of a biological system execute in parallel, whereas computing is sequential. Emulating the parallel processing ability of a biological system is a difficult task and is not feasible with conventional hardware architectures and existing software (Mazza, 2010); it requires architectures based on scalable parallel and distributed systems (e.g., grid and cloud computing, discussed later). Further, exploiting the capacities and capabilities of parallel architectures requires advanced software design methodologies, difficult paradigm shifts in programming techniques, and the implementation of sophisticated algorithms.
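At the smallest scale, even a single workstation can parallelize independent model runs. The sketch below fans a parameter sweep out over multiple CPU cores with the standard library; the "model" is a placeholder function standing in for a real solver call.

```python
# Minimal sketch: independent (placeholder) model runs spread over CPU cores.
from concurrent.futures import ProcessPoolExecutor

def run_model(parameter: float) -> float:
    """Placeholder for one simulation run with a given parameter value."""
    return sum((parameter * i) ** 0.5 for i in range(1, 10_000))

if __name__ == "__main__":
    parameter_sweep = [0.1, 0.5, 1.0, 2.0, 5.0]
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(run_model, parameter_sweep))
    for p, r in zip(parameter_sweep, results):
        print(f"parameter={p:<4} result={r:.2f}")
```

Grid and cloud deployments generalize this same fan-out pattern across many machines rather than many cores.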

Issues with development and distribution of tools: Doing meaningful analysis with all the data from various resources requires appropriate tools and methodologies. Typically, tools are written for a specific set of requirements and contexts. The systems approach in biology is a rapidly developing field where the pace of data production and methodological progress is rapid, and there is often a requirement for new resources. To deal with this situation, there are often not many tools available; new tools and standards have to be made, or existing ones modified.

Using domain standards supports interoperability and reuse. Standardization strictly focuses on the most essential elements and commonalities but compromises on variations. In systems-driven research, if standards are enforced stringently there is a risk that novel findings may be missed. Thus, the characteristics of the systems-driven research field require the adoption of best engineering practices that facilitate the development of customized computational infrastructure. The challenge of software customization has been partly met by using off-the-shelf software components developed by the scientific community and several organizations during the ~omic/post-omic period. These readily available software components provide much of the functionality and capabilities required and can in turn be chained together in a workflow to achieve much bigger objectives rather than reinventing the wheel. However, software development using off-the-shelf tools poses several problems of its own compared with building a system from scratch or scaling a system by reusing components built internally in an organization, because off-the-shelf tools were originally meant to run as standalone applications and have no mechanism for interacting with other programs. It is extremely important, now more than ever, for the community to make collective efforts toward developing infrastructure by creating open-source reusable libraries and toolkits. The coverage of such initiatives should not be limited to software components but should also extend to data, algorithms, and analysis methodologies. Although most of these are known issues, community initiatives to rectify them are progressing slowly because of political, funding, and intellectual property reasons.

Approaches, Technologies, Architecture and Design Strategies for Data Integration

Systems research is highly interdisciplinary and involves the meaningful interpretation of data from high-throughput experiments through the building of multiscale models. There is a continuous need to integrate existing technologies with newly developed and emerging ones. In this section we discuss the design and development strategies, driven by systems research, undertaken toward data integration by building database and Web application infrastructures. Further, we discuss how such an information infrastructure can allow disparate research groups to access integrated data sources and reuse tools and methodologies that help cross-collaboration in generating data and models. Our main focus here is to summarize database and Web application technologies.

The approaches to integrating data can use a centralized model or a distributed model (Sheth and Larson, 1990). In the centralized model there is one unified schema in a massive central repository (e.g., a warehouse), which is framed based on the schemas of the individual data sources (Reddy et al., 1994). The data transferred to the central repository are collected, integrated, stored, and made available for search and presentation (e.g., Biowarehouse, Lee et al., 2006; Atlas, Shah et al., 2005). The distributed model includes federation and mediation approaches. In the federation approach the data are left in the respective fully functional expert databases, maintaining data autonomy while still providing integrated access to distributed data (e.g., Entrez, Sayers et al., 2010; Biomart, Haider et al., 2009; DAS, Jenkinson et al., 2008; EBI, Brooksbank et al., 2005). Here, integration expects no data transfer to any central repository; the design relies on an agreed data exchange protocol between the participating databases, a central hub coordinates and organizes the queries across databases, and data retrieval is powered by each database. Mediation does not store any data on its own; rather, it provides a virtual view of the integrated sources (e.g., DiscoveryLink, Haas et al., 2001).

The integration solutions developed to date can be grouped into three technological layers: data-layer-centric, middle/object-layer-centric and application-layer-centric solutions. Data-layer-centric solutions involve databases and DBMSs in the form of data warehouses, multi-databases, distributed databases or federated databases. Middle/object-layer-centric solutions mainly support distributed applications, and many of the technologies in this layer relate to middleware development. Middleware is software that resides between a data store on one side and, on the other, the applications in which data are collected or further processed. The object-based approach centers on the use of interoperable standard objects; examples of technologies in this category are CORBA (Common Object Request Broker Architecture), SOAP (Simple Object Access Protocol), SOA (Service-Oriented Architecture), DCOM (Distributed Component Object Model), REST (Representational State Transfer) and Java (EJB, RMI). In application-layer-centric solutions the applications themselves take responsibility for integrating data; link integration, view integration and Web services (a variant of link integration; see the review by Stein, 2003 ) belong to this layer. Projects using application-layer-centric solutions have adopted the centralized model, the distributed model, or even a hybrid of the two (e.g., CIDMS, 2007 ; ApiDB, Wang et al., 2007 ), which eliminates several disadvantages (e.g., improves performance) posed by either model alone.

Centralized databases

One early and popular data-integration technique using the centralized model was to provide a unified interface to heterogeneous data sources (e.g., SRS, the Sequence Retrieval System; Zdobnov et al., 2002 ; for a list of publicly accessible SRS servers see Biowisdom, 2010 ). Central repositories are created locally by full-text indexing data that are present mostly in flat-file/XML format (locally mirrored data sources). Some of these repositories even allow seamless integration with numerous bioinformatics analysis tools. Users can employ keywords and identifiers such as accessions and symbols to search and navigate data contained in various databases regardless of their format, querying them in the same way and at the same time and capturing the results. A variation of this approach is to mine descriptors (representative data) from various databases according to predefined criteria; for example, the GeneCards database organizes one file per human gene (Safran et al., 2002 ), and such files are then collectively indexed and made full-text searchable. Descriptors obtained from the sources contain only the most essential information together with hyperlinks to the original source, so the user is presented with a collective summary from multiple sources in a single information space with search and data-filtering capabilities. The drawback is that this approach cannot support semantic searches that use hierarchically structured information (ontologies).
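
The following minimal Python sketch illustrates the SRS-style idea described above, using nothing beyond the standard library: locally mirrored flat-file descriptors are tokenized into an inverted index so that records from different sources can be searched by keyword in the same way at the same time. The records are invented examples.

```python
# Minimal sketch of keyword indexing over locally mirrored flat-file records.

records = {
    "P68871": "Hemoglobin subunit beta Homo sapiens oxygen transport",
    "P69905": "Hemoglobin subunit alpha Homo sapiens oxygen transport",
    "P00698": "Lysozyme C Gallus gallus hydrolase",
}

# Build an inverted index: keyword -> set of record identifiers.
index = {}
for acc, text in records.items():
    for token in text.lower().split():
        index.setdefault(token, set()).add(acc)

def search(keyword):
    """Return identifiers of records whose descriptor contains the keyword."""
    return sorted(index.get(keyword.lower(), set()))

print(search("hemoglobin"))   # ['P68871', 'P69905']
print(search("hydrolase"))    # ['P00698']
```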

One of the widely used technologies based on the centralized model is the data warehouse (e.g., Biowarehouse, Lee et al., 2006 ; GUS, Clark et al., 2005 ; Atlas, Shah et al., 2005 ). In this approach data are imported from all remote sources into one local database via special scripts or programs called loaders. A loader is a piece of software that converts data from a source format into the required format; loaders can also be designed to apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity (discussed later). The information imported from the various databases is collected and organized in a unified data model (Lee et al., 2006 ). This form of data integration provides a single access point to all of the data, capable of answering not only the questions that each individual source database could have answered but also more complex questions requiring integrated information that no individual source could provide. Other key benefits include good performance and improved data consistency. The major problems with this approach include keeping information up to date (data synchronization), scalability (which involves tinkering with the database schema and writing or changing loader programs) and data privacy. Despite these disadvantages, several recent projects in the life-science domain based on the state-of-the-art Semantic Web technology (discussed later), such as RDFScape, Bio2RDF, CardioSHARE and KNO.E.SIS, use the centralized model, mainly because of the performance limitations of the federated approach.
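
A loader of this kind can be sketched in a few lines. The example below is a hedged illustration, not any particular warehouse's loader: an invented tab-separated source dump is converted into a small unified SQLite schema, with a trivial normalization step standing in for the semantic normalization discussed above.

```python
# Minimal loader sketch: convert one (invented) source format into a unified
# warehouse schema held in a single local database.
import sqlite3

source_dump = "HGNC:4922\thk1\thexokinase 1\nHGNC:8888\tpkm\tpyruvate kinase M1/2"

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE gene (
    id TEXT PRIMARY KEY,      -- identifier as used in the warehouse schema
    symbol TEXT,
    description TEXT,
    source TEXT)""")

def load(dump, source_name):
    """A simple loader: parse the source dump and map it onto the schema."""
    for line in dump.strip().splitlines():
        gene_id, symbol, desc = line.split("\t")
        # toy normalization step: upper-case gene symbols
        conn.execute("INSERT OR REPLACE INTO gene VALUES (?,?,?,?)",
                     (gene_id, symbol.upper(), desc, source_name))

load(source_dump, "example_source")
print(conn.execute("SELECT id, symbol FROM gene").fetchall())
```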

Federated databases and web services

The problems of the data-warehouse approach are resolved in federated databases, mainly because centralization of the data is not required. Federated databases are playing an increasingly large role in life-science data integration, and several database and Web application projects have embraced this approach.

Adopting a distributed model implies dealing with scattered, heterogeneous data resources, so a technology is needed that can automate access to remote resources and manage and link data properly. Web services are such a technology, employed to address the issues of the distributed model chiefly concerning application-to-application communication: a programmatic interface to a resource, made available over the Web to facilitate application-to-application communication, is referred to as a Web service (W3C, 2002 ).

Web services technology follows SOA. The SOA architectural model decouples the service provider (source) from the service consumer (sink), the goal being great flexibility in constructing distributed computing systems based on services: the consumer can choose any service from any provider, regardless of the implementation language or the platform it runs on, as long as the interface is compatible. XML is accepted as a ubiquitous representational language for data integration and interoperability (Achard et al., 2001 ), and for the same reason XML-based standards (e.g., SOAP and the Web Services Description Language, WSDL) are predominantly used to describe data, services and the communication protocol, maintaining interoperability between services (Neerincx and Leunissen, 2005 ). The implicit advantage is that the decoupled nature of the approach provides a means to develop solutions that can keep pace with rapid and dynamic developments in systems biological research: software components can evolve separately yet be made interoperable, and they are easily implemented and scaled (e.g., DAS, Jenkinson et al., 2008 ; Hmida et al., 2005 ). Because of these advantages in flexibility and extensibility, many biomedical databases have started providing Web services (e.g., NCBI, Sayers et al.; EMBL; DDBJ; BioMoby, Wilkinson and Links, 2002 ; caGRID, Saltz et al., 2006 ; Pathway Commons, Cerami et al., 2006 ; Biomodels, Li et al., 2009 ).
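
As a hedged illustration of how a SOAP/WSDL service is consumed programmatically, the sketch below uses the Python zeep library. The WSDL URL, the getSequence operation and its parameter are placeholders, not a real service; in practice the WSDL published by the provider defines the available operations.

```python
# Hedged sketch of consuming a SOAP/WSDL web service with the zeep library.
from zeep import Client

WSDL_URL = "https://example.org/sequence-service?wsdl"   # hypothetical WSDL

client = Client(WSDL_URL)          # reads the WSDL service description
# Operations listed in the WSDL become methods on client.service.
# 'getSequence' and its 'accession' parameter are assumed names only.
record = client.service.getSequence(accession="P68871")
print(record)
```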

Web services have paved the way for tools that help to: (1) display and provide access to integrated content through Web and application interfaces (e.g., Jemboss, Carver and Bleasby, 2003 ; SeWeR, Basu, 2001 ; cPath, Cerami et al., 2006 ; link and view integration, Stein, 2003 ); (2) render complex, interactive biological visualizations (PathwayExplorer, Mlecnik et al., 2005 ); (3) automate interactive forms that accept data from users (e.g., XForms, W3C/CWI, 2010 ); and (4) most importantly, develop dynamic networks of XML-based data pipelines usable by analytical tools (e.g., CellDesigner plugins, Funahashi et al., 2007 ; Van Hemert and Dickerson, 2010 ; Cytoscape plugins, cPath 3 ; WikiPathways, 2008 ), including advanced suites for automatic workflow generation (e.g., BioMOBY, Wilkinson and Links, 2002 ).

In a distributed model, researchers often need to access several services to accomplish a useful task, and interoperability between those services is a major problem. In response, they resort to creating their own workflow: fetching data from one source, usually reformatting it, submitting it to a service of another source, parsing the results, reformatting again and resubmitting, continuing until an acceptable end result is reached. Many projects have tried to solve these interoperability problems by developing specialized platforms called grids (e.g., caGRID, Saltz et al., 2006 ; PathGrid, Arbona et al., 2007 ; Walton et al., 2010 ; The Virtual Kidney, Harris et al., 2009 ; Abramson et al., 2010 ; GEMSS, Benkner et al., 2005 ). At its conception the “Grid” was envisioned as a distributed and cost-effective solution for boosting computational power to solve large-scale mathematical and data-bound problems; the current, more mature understanding of the grid is as a robust framework (mostly based on SOA principles) for performing distributed computing tasks at internet scale, enabling service-oriented science (Foster, 2005 ). Grid services can provide either an HTML-based interface (classic grids) or a more advanced SOAP interface; the SOAP interface is widely accepted because of several advantages over classic grids (Neerincx and Leunissen, 2005 ). In a grid, services are distributed over many servers and clients use specialized software to discover and execute them. A grid usually relies on middleware that wraps existing programs to create a standard application programming interface (API) for communication between services. A number of useful, user-friendly tools have been developed to support the grid platform, including graphical workflow-management tools (e.g., Taverna, Oinn et al., 2006 ), schedulers (e.g., GridFlow, Bo et al., 2005 ) and script-translation tools (GridAnt, Amin et al., 2004 ; SQUID, Carvalho et al., 2005 ).
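
The ad hoc chaining described at the start of this paragraph, before a workflow system takes over, often looks like the following Python sketch: fetch from one service, reformat, and submit to a second. Both endpoints and the payload fields are hypothetical placeholders used only for illustration.

```python
# Hedged sketch of a hand-built two-step workflow over hypothetical services.
import requests

def fetch_ids(gene):
    # Step 1: query a (hypothetical) identifier-mapping service.
    r = requests.get("https://example.org/idmap", params={"gene": gene})
    r.raise_for_status()
    return r.json()["uniprot_ids"]

def annotate(uniprot_ids):
    # Step 2: reformat the result and submit it to a second (hypothetical) service.
    payload = {"accessions": ",".join(uniprot_ids)}   # the reformatting step
    r = requests.post("https://example.org/annotate", json=payload)
    r.raise_for_status()
    return r.json()

if __name__ == "__main__":
    print(annotate(fetch_ids("HK1")))
```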

Another technology used for building distributed systems is CORBA. CORBA is tightly coupled, object-centric and stateful, whereas Web services are loosely coupled, use a message-exchange model and are stateless. This difference gives Web services a flexibility and simplicity in implementing distributed systems that is not seen in CORBA implementations. CORBA can nevertheless interoperate and coexist with Web services, much as a grid does; the important distinction is that Web services are an integral part of the grid, whereas they are not a native component of CORBA.

Another important advantage of grids is their ability to leverage existing IT infrastructure to optimize the usage and sharing of computational resources and to manage large amounts of data. The rationale behind grid technology is similar to that of the electric power grid, where users do not know the details of the technology or the sources: they simply connect to common interfaces, subscribe and consume what they need.

One of the main problems with grid computing infrastructure is over-provisioning of computational resources; grids are therefore suited to data-intensive tasks and are not economically feasible for small ones. Cloud computing has recently evolved from grid computing and provides on-demand resource provisioning (Protein Bioinformatics Infrastructure for the Integration and Analysis of Multiple High-Throughput “omics” Data 4 ). Readers are encouraged to refer to the reviews by Bateman and Wood (2009 ), Martin (2001 ) and Stein ( 2010 ) for detailed discussion of cloud computing and its potential applications in biology. Consumers of a cloud service need not own the infrastructure, software or platform in the cloud, nor care about how they are maintained, and the scaling of computational resources is dynamic (on demand) and easy because of virtualization technology. For these and other reasons we expect wide adoption of cloud computing in systems research, which has varying demands for computational resources and a need for cost effectiveness; its adoption can already be witnessed in several genomics initiatives (Langmead et al., 2009 ; Baker, 2010 ; Schatz et al., 2010 ).

The availability of SOAP/WSDL-based Web services merely gives programmatic access to databases and Web applications. SOAP and WSDL alone cannot make services and data self-describing, so machines cannot anticipate the meaning of appropriate services and their interfaces. This is a major problem for data integration using Web services because of semantic heterogeneity, which is caused by disagreement about the meaning, interpretation or intended use of the same or related data and services. Because the Web service architecture lacks semantics, functionalities such as automated service discovery, mediation and reuse of services are not possible, yet these functionalities are essential for linking Web services, creating service pipelines and enabling efficient, more meaningful use of Web services. Such pipelines help users explore and manipulate data, automate analysis and solve complex biological tasks. In response, ontology-based solutions (Semantic Web services) were developed and are used in several successful projects [e.g., caCORE, Komatsoulis et al., 2008 ; BioMOBY (S-MOBY), Wilkinson et al., 2005 ; myGrid workflows using BioMOBY services, Kawas et al., 2006 ; Discovery Net, Ghanem et al., 2002 ; and TAMBIS, Stevens et al., 1999 ]. Projects that couple semantics with Web services function more effectively, as this eliminates critical interoperability problems that commonly surface when Web services are used alone. They define a registry-based discovery system built on the Universal Description, Discovery and Integration protocol (UDDI 5 , an OASIS standard for creating service directories that enable applications to dynamically find and use Web services). Each of the projects listed here addresses the interoperability problems with Semantic Web services differently; we encourage you to refer to the review by Good and Wilkinson ( 2006 ), which describes the technology and motivation behind using Semantic Web services in some of these projects.
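
The registry-based discovery idea can be illustrated with a toy, purely local Python sketch. Real registries (UDDI, BioMOBY) hold far richer, ontology-backed service descriptions; the concept names and endpoints below are invented.

```python
# Toy sketch of registry-based service discovery: providers advertise what
# their services consume and produce, and clients discover services by those
# concepts rather than by hard-coded URLs.
REGISTRY = []

def register(name, endpoint, consumes, produces):
    """A provider advertises its service with semantic input/output types."""
    REGISTRY.append({"name": name, "endpoint": endpoint,
                     "consumes": consumes, "produces": produces})

def discover(consumes, produces):
    """A client looks up services by the concepts they consume and produce."""
    return [s for s in REGISTRY
            if s["consumes"] == consumes and s["produces"] == produces]

register("id-mapper", "https://example.org/idmap",
         "GeneSymbol", "UniProtAccession")
register("annotator", "https://example.org/annotate",
         "UniProtAccession", "GOAnnotation")

print(discover("GeneSymbol", "UniProtAccession"))
```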

The performance of Web services is another concern, especially when integrating large datasets. Overhead is introduced by SOAP message size (XML text) and complexity, XML parsing, the cost of serialization and deserialization, connection establishment, security validation, UDDI registration and the querying of XML. One efficient way to improve the performance of Web services is to minimize communication delays, for example by using compression formats such as binary XML, which has been shown to provide a performance boost. A complementary, community-level solution is to agree on one data representation format, preventing unnecessary and inefficient conversions between formats. For example, pathway data from public databases are made available primarily in one of the following formats: CellML, SBML and BioPAX; to integrate all available pathways one has to resort to inefficient conversions between these formats (using converters such as CellML2SBML and SBML2BioPax), which is prohibitive, especially in real time.
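
As a small illustration of the communication-delay point, the snippet below compresses a verbose toy SOAP payload with ordinary gzip from the Python standard library; the binary XML formats mentioned above pursue the same goal with format-aware encodings.

```python
# Compressing a verbose XML/SOAP payload before transmission substantially
# reduces message size. The envelope below is a toy example.
import gzip

soap_message = ("<soap:Envelope><soap:Body>" +
                "<getSequence><accession>P68871</accession></getSequence>" * 200 +
                "</soap:Body></soap:Envelope>").encode("utf-8")

compressed = gzip.compress(soap_message)
print(len(soap_message), "bytes uncompressed")
print(len(compressed), "bytes gzip-compressed")
```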

Ontology and semantic web technology (SWT)

Semantic Web technology deals with the meaning of information: it enables computers to understand Web content and to perform tedious tasks such as finding and assembling knowledge from multiple sources on the Web. In the sections above we have already discussed the use of SWT in the context of Web service discovery. The most important scenario in which the Semantic Web matters is identifying semantically related (same-meaning) concepts across different resources. A mapping between related concepts in databases makes it possible to query multiple databases with a single query; it should also be possible to automatically identify and map the various data fragments, creating rich information spaces that can be explored for new knowledge.

SWT thus focuses on using: (1) ontologies to explicitly specify domain concepts; (2) standard representation languages [e.g., the Resource Description Framework (RDF), 2010 6 ; RDF Schema (RDFS), RDF Vocabulary Description Language 1.0, 2004 7 ; the Web Ontology Language (OWL), 2004 8 ] to name, encode, describe and combine information; (3) standard Web protocols to access the information (e.g., the query language for RDF, SPARQL, 2008 9 ); and (4) technologies that support computational tasks such as inference and distributed query (inference engines, Ruttenberg et al., 2009 ). We list here several interesting reviews (Tyrelle, 2005 ; Wang et al., 2005 ; Good and Wilkinson, 2006 ; Post et al., 2007 ; Ruttenberg et al., 2007 ; Sagotsky et al., 2008 ; Antezana et al., 2009 ) that cover SWT in detail.

An ontology is defined as a shared vocabulary plus a specification of its intended meaning (Guarino, 1998 ). Ontologies thus represent known concepts unambiguously and are necessary in the Semantic Web for the resolution of naming conflicts. Since SWT depends on ontologies, there is a need to rapidly develop globally accepted, high-quality ontologies. Several ontologies [e.g., the Gene Ontology (GO), Ashburner et al., 2000 ; the Cell Cycle Ontology and the Mammalian Phenotype Ontology, Antezana et al., 2009 ; SBO, Le Novere, 2006 ; BioPAX, 2006 ; the Pathway Ontology, 2010 10 ; the Event Ontology, Kushida et al., 2006 ] are now an integral part of biological and systems-driven research. Details of these and many other ontologies can be found on the Open Biological and Biomedical Ontologies Web site 11 ; alternatively, Bioportal (Noy et al., 2009 ), a Web repository for biomedical ontologies, can be used to access and share ontologies. We refer the reader to useful reviews (Bard and Rhee, 2004 ; Jurisica et al., 2004 ; Puustjarvi and Puustjarvi, 2009 ) on ontologies for knowledge management in the biomedical field.

The Semantic Web has adopted essentially three formal, machine-readable languages based on a mathematical graph model. (1) RDF makes information essentially self-describing. RDF represents data as statements called “triples” of the form subject-predicate-object, analogous to a complete sentence consisting of a subject, a predicate (verb) and an object, which form the basic building blocks of expression in natural language. Just as a collection of sentences makes a paragraph, a collection of triples (even spanning multiple documents) can form networks of interconnected logical graphs that describe information nodes and their interrelationships with other nodes, essentially integrating information; OWL is also built on RDF. (2) RDFS is a framework overlaid on the RDF graph model that specifies a standard way to describe resources represented in RDF within a particular domain. RDF provides the model and syntax for describing resources but by itself cannot define the meaning of those resources; RDFS can be used to create vocabularies describing anything from diseases to molecules, experiments and instruments, or even abstract concepts such as consciousness and behavior. (3) OWL is used to define the content of information by defining the types of objects, their vocabulary and their relationships in an RDF document. OWL facilitates greater machine interpretability of Web content than XML, RDF and RDFS by providing an extended vocabulary along with formal semantics. OWL has three increasingly expressive sublanguages: OWL Lite, OWL-DL (description logic) and OWL Full. OWL-DL is based on a first-order description logic, which means that an OWL-DL ontology is expressed in a formalism with well-defined semantics over which automated reasoning can be undertaken. In our view, OWL is an ideal language for capturing knowledge in the form of ontologies (Stevens et al., 2007 ) and can satisfy most of the requirements of systems-driven research.
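
The triple model is easy to see in code. The sketch below, assuming the rdflib library and invented example URIs, builds a tiny RDF graph and queries it with SPARQL.

```python
# Minimal sketch of the triple model: statements are (subject, predicate,
# object) triples; a collection of triples forms a graph that SPARQL can query.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

EX = Namespace("http://example.org/bio/")   # invented namespace

g = Graph()
g.add((EX.TP53, RDF.type, EX.Gene))                      # "TP53 is a Gene"
g.add((EX.TP53, EX.participatesIn, EX.Apoptosis))        # a relationship
g.add((EX.TP53, EX.label, Literal("tumor protein p53")))

query = """
SELECT ?gene ?process WHERE {
    ?gene a <http://example.org/bio/Gene> ;
          <http://example.org/bio/participatesIn> ?process .
}"""
for gene, process in g.query(query):
    print(gene, process)
```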

Semantic Web languages rely on unique, global identity through the use of Uniform Resource Identifiers (URIs, URI Interest Group, 2005 12 ). A URI is a string of characters used to identify a name or a resource on the internet; it identifies a resource either by location (Uniform Resource Locator, URL) or by name (Uniform Resource Name, URN). Such identification enables interaction with representations of the resource over a network (the World Wide Web) using specific protocols. Moreover, it is by matching the URI of an object across two separate Semantic (RDF) documents that their information can be integrated. SWT using URIs can therefore help to overcome data-integration problems with less effort.
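
The following hedged sketch, again assuming rdflib and invented URIs, shows the integration effect described above: two separate RDF documents that use the same URI for an entity merge simply by being parsed into one graph.

```python
# Two RDF documents sharing a URI integrate automatically when parsed into
# the same graph. The Turtle snippets and URIs are invented for illustration.
from rdflib import Graph

doc_a = """@prefix ex: <http://example.org/bio/> .
ex:TP53 ex:encodedBy ex:Chromosome17 ."""

doc_b = """@prefix ex: <http://example.org/bio/> .
ex:TP53 ex:associatedWith ex:LiFraumeniSyndrome ."""

g = Graph()
g.parse(data=doc_a, format="turtle")
g.parse(data=doc_b, format="turtle")   # same URI, so the statements line up

# All statements about the shared URI are now available together.
for s, p, o in g:
    print(s, p, o)
```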

Semantic Web technology is increasingly gaining acceptance in the biological community, and several projects have emerged, such as Bio2RDF, RDFScape (Splendiani, 2008 ), YeastHub (Cheung et al., 2005 ), BioPAX and semanticSBML (Krause et al., 2010 ), to name a few. More and more biological data and ontologies are now made available in RDF (e.g., AlzPharm, Lam et al., 2007 ; UniProt RDF) and OWL (e.g., GO OWL, Ashburner et al., 2000 ; biOzen; the UMLS ontology; MGED OWL). To support the progress of Semantic Web technology, several software packages have been developed by open-source developers and made available through the centralized Web portal SemWebCentral 13 . Open-source efforts we would particularly point to include storage and retrieval solutions for RDF triples with SPARQL querying (e.g., Virtuoso; Sesame; 3store, Harris and Gibbins, 2003 ; Mulgara) and frameworks for building Semantic Web applications, including rule-based inference engines (e.g., Jena; MobyServlet, Gordon et al., 2007 ).

Technologies with social networking: web2.0

The data-integration technologies discussed so far involve data exchange between a computer and its user; the notion of data sharing and collaboration among users is not part of such integration facilities. Web2.0 is a technology based on a user-computer-user collaborative model and transcends traditional data-integration technologies, giving it an edge as a more suitable platform for enhancing systems research. Web2.0 is conceived as a social, collaborative and collective Web space (Kamel Boulos and Wheeler, 2007 ; Zhang et al., 2009 ). Unlike Web1.0, which is “read only” and merely meant to display information on the Web, Web2.0 and 3.0 are “read and write”: the user constantly interacts with the Web and works in a networked setting.

The key characterizing elements of Web2.0 are the social Web, user-added value, use of Web services and software as a service (Zhang et al., 2009 ). The social Web and user-added value emphasize the need to connect people and use the collective power of the community to achieve data-integration tasks. Wikipedia is an ideal example of a successful social Web project in which content is both created and edited by users. Owing to the success of wikis, there are considerable efforts in the biological community to port the Wikipedia model into the biological domain (e.g., Gene Wiki, Huss et al., 2008 ; WikiProteins, Mons et al., 2008 ; WikiPathways, Pico et al., 2008 ).

To facilitate interaction between users and computers on the Web, both the data and the interface used should be dynamic. Two of the most powerful programming techniques that have enabled such interactivity on the Web (by building rich client applications) are AJAX (Asynchronous JavaScript and XML) and Flash. Both are equally capable of creating interactive Web applications; we, like many biological projects, have used AJAX in several of our projects, essentially because it is an open-source initiative and is supported by default in almost all Web browsers. In a typical Web interaction the client makes a request to the server, waits for the response to return and acts on the content before it can make another request. With AJAX, by contrast, clients can interact with the server continuously without waiting for the immediate response to each request: the processing of transactions happens in the background, allowing messages to be exchanged between client and server without interruption. Web applications designed for such rich interactivity are increasingly used for data presentation, rendering complex and dynamic visualizations, context-dependent and user-friendly search and browsing interfaces, and context-aware data submission and feedback forms. One recent project that could be valuable to the systems biology community is Payao (Matsuoka et al., 2010 ), built using Flash. Payao is a community-based, collaborative Web service platform for model curation, update tracking and tagging; users can collaboratively engage in model building and curation. Payao supports standard representations such as SBML (Systems Biology Markup Language, Finney and Hucka, 2003 ) as its input and output data format and SBGN (Systems Biology Graphical Notation, Novère et al., 2009 ) for visualization of models.
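
The sketch below shows the server side of such an interaction, assuming a Flask-based JSON endpoint of the kind an AJAX client calls repeatedly in the background while the page stays responsive; the route, data and parameter names are invented, and the browser-side JavaScript is omitted.

```python
# Hedged, server-side sketch of the AJAX interaction pattern: the client
# issues small background requests and the server answers with data (JSON)
# rather than a full page, so the interface stays responsive.
from flask import Flask, jsonify, request

app = Flask(__name__)
GENES = {"HK1": "hexokinase 1", "PKM": "pyruvate kinase M1/2"}

@app.route("/suggest")
def suggest():
    # Called repeatedly in the background as the user types a query.
    prefix = request.args.get("q", "").upper()
    hits = {k: v for k, v in GENES.items() if k.startswith(prefix)}
    return jsonify(hits)

if __name__ == "__main__":
    app.run(debug=True)
```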

The technological implications and advantages of Web services as a means of supporting computer-to-computer interaction were discussed earlier. Web2.0 makes extensive use of Web services to achieve interoperability between data resources and software by exposing Web application programming interfaces (Web APIs) described using WSDL. Providing Web-service-based pipelines helps users to explore and manipulate data, to mix and match Web services that can use the data in a variety of novel ways, and to solve complex problems. Launching the tools and algorithms commonly used for analysis, data management and visualization as Web applications and services (software as a service, SaaS), in the manner of Google Docs, 2010 14 , would help the scientific community in multiple ways: working online, constantly connected, collaborating and sharing information is advantageous; software can be continuously improved in response to user feedback and needs; and platform dependency is eliminated, enabling use on diverse operating systems (e.g., UNIX, Mac, Windows and Android) without the tedious task of local installation. All of this eases the assembly and integration of data from heterogeneous sources, and as a result Web2.0 is likely to promote the discovery of new knowledge.

Future Opportunities, New Challenges and Recommendations

The role of modeling and simulation in the systems-driven analysis of living systems is now clearly established. Emerging disciplines such as systems biology, and worldwide research initiatives such as the Physiome project (Hunter et al., 2008 ) and the Virtual Physiological Human project (Fenner et al., 2008 ; Hunter et al., 2010 ), are based on an intensive use of modeling and simulation methodologies and tools. A key aspect in this context is the efficient integration of models representing different biological and physiological functions, at different levels of organization, spanning different scales. Handling such complex integration challenges and improving our ability to conduct biologically meaningful system-scale analysis require a unique, interoperable, universal information framework with the following characteristics (Boyle et al., 2008 ): (1) unique identification and dynamic data resolution with the ability to track data provenance; (2) services to manipulate data (e.g., relationship, synonym, query, registry and ontology services); (3) services to analyze data (e.g., inferential, statistical and mathematical analysis and simulation services); (4) services for data presentation and visualization (e.g., rendering complex interactive networks and pathways, online collaborative modeling); (5) Semantic Web enablement (common data syntax, shared semantics and semantic discovery) with the ability to access data and semantics through the same mechanism (e.g., dynamic discovery of services); and (6) an implementation sufficiently robust and portable to be usable by researchers with a wide variety of backgrounds and computing expertise. A data/Web infrastructure built on such a framework would be scalable, with efficient data-handling potential. A number of novel solutions for developing information frameworks with the above functionalities have been proposed, in the form of prototype systems [e.g., the Simple Sloppy Semantic Database (S3DB); Almeida et al., 2006 ], database design methodologies (Maier et al., 2009 ) and software (e.g., I-cubed, Boyle et al., 2008 ; generative software development, Nord and Czarnecki, 2004 ). As of now, however, no single framework has been implemented that supports all or most of the listed functionalities.

Like many other research groups, we believe that the future of systems research requires semantically based data integration through ontologies. Putting data into easily accessible repositories in standardized formats is an essential part of realizing the Semantic Web vision, and there is a need for tools, databases and Web applications that make the creation, publishing and searching of RDF/OWL intuitive and simple for biomedical researchers and clinicians. To this end, existing databases should actively participate in representing their data in semantic languages such as RDF/OWL and in providing Web services. The potential benefits of SWT are increasingly being realized in the life-science and health-care communities, and work is in progress to address these issues and to develop and support the use of SWT through internationally organized efforts (e.g., the Semantic Web Health Care and Life Sciences Interest Group, HCLSIG 15 , initiated by the W3C).

Like any evolving technology, the Semantic Web has its share of issues. Two prevailing problems are the ambiguous identification of resources and the vagueness of resource definitions (Wang et al., 2005 ): different ontologies often refer to the same concept with different URIs, and a particular resource can have ambiguous descriptions. This ambiguity calls for URI-resolution steps when integrating data. Biologists involved in ontology development tend to focus on the semantics rather than on the Web technology, yet merely creating ontologies will be of little use if the developed concepts are kept hidden (inaccessible) or redundantly identified in the Semantic Web. It is extremely important to provide explicit access to each ontological concept via resolvable URIs: either ontology providers should build concept-resolution systems themselves, or they should make their ontologies available to consortia that have the infrastructure to run ontology-resolution services. URI harmonization also strongly requires both technical and social collaboration. Several proposals for the standardized identification of biological entities and relationships have been put forward (e.g., the Life Science Identifier, LSID, Martin et al., 2005 ; URI-based identification, URI Interest Group, 2005 12 ; MIRIAM URIs, Laibe and Le Novere, 2007 ), but no consensus has yet been reached on a commonly accepted identification scheme. The Semantic Web community should work toward agreement on an explicit identification system that can unambiguously specify biological resources; the efforts of the Shared Names initiative (2009) 16 are a valuable beginning toward normalizing URIs in the biomedical context. Once these requirements are met, SWT should see wide adoption within the systems-driven research community.

Representative current applications of the Semantic Web are SWEDI (Post et al., 2007 ) and an example system developed by the HCLSIG 15 ; at present both address issues of data integration, but little work has been undertaken to implement data-analysis and interpretation functionality. The reason lies in the practical difficulty of implementing mechanisms that extract meaningful knowledge from raw integrated data, and very few recent projects have tried to leverage the capabilities of inference technology. RDFScape (Splendiani, 2008 ) is one of the few recent efforts in the life-sciences domain to use reasoners (programs that can determine relations among ontology classes), applying them to BioPAX ( 2006 ) data inside Cytoscape (Shannon et al., 2003 ). Another novel study demonstrated a different approach, using Semantic Web methodologies to integrate gene data with phenotype data and applying RDF graph (network) analysis with reasoners to prioritize candidate cardiovascular disease genes (Gudivada, 2007 ). However, all the projects that have used inference technology have been tried on data integrated from a limited number of resources, not on Web-scale datasets.

Ideally, where RDFS and OWL constructs are used, it should be possible to apply automated reasoning over the data schema and derive meaningful new knowledge. At present, however, the Semantic Web is not fully ready to be equipped with inference engines. One reason is the problem posed by large ontologies [e.g., the Unified Medical Language System (UMLS), 2004 17 , the NCI thesaurus (NCIt, 2010) 18 , GO, Ashburner et al., 2000 ; Good and Wilkinson, 2006 ]: it is currently unfeasible to retrieve, modify and process their concepts at runtime as conceived for their use on the Semantic Web, because current reasoners and other Semantic Web tools require all the information they process to be loaded into memory. This can severely curtail performance, or fail outright, when scaled to large ontologies. Several algorithms have been proposed to decompose large ontologies into smaller, more manageable and meaningful pieces that retain some of the semantics of the full version; researchers can then further modify these ontology models to their specific needs (for example, reducing complexity and/or cleaning up inconsistencies), creating versions over which inference is feasible. Multiple limitations still persist, however, requiring future research in this direction.
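
One simple decomposition idea can be sketched as follows, assuming the rdflib library and a tiny invented class hierarchy: instead of loading a whole ontology into a reasoner, only the sub-hierarchy rooted at a class of interest is extracted. A real ontology such as GO in RDF/OWL would be parsed from a file instead.

```python
# Hedged sketch of extracting a manageable module from a larger ontology:
# collect only the classes reachable from a chosen root via rdfs:subClassOf.
from rdflib import Graph, Namespace
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/onto/")   # invented toy ontology
g = Graph()
g.add((EX.Kinase, RDFS.subClassOf, EX.Enzyme))
g.add((EX.Hexokinase, RDFS.subClassOf, EX.Kinase))
g.add((EX.Protease, RDFS.subClassOf, EX.Enzyme))

def submodule(graph, root):
    """Collect the classes reachable from 'root' via rdfs:subClassOf."""
    members = {root}
    for child in graph.subjects(RDFS.subClassOf, root):
        members |= submodule(graph, child)
    return members

print(sorted(str(c) for c in submodule(g, EX.Kinase)))
```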

In a previous section ( Federated Databases and Web Services ) we discussed how Web services can help to develop federated systems that keep pace with the rapid advances in systems research. However, numerous issues associated with Web services could hinder this progress, including code-maintenance issues that affect scalability and ease of development when Web services are built on SOA. Newer software-development paradigms such as Aspect-Oriented Programming (AOP) solve the problems of code tangling and code scattering (Kiczales et al., 1997 ), but the adoption of AOP has yet to percolate into the life-science development stream.

Ontologically described Web service interfaces are not yet fully available, and this needs to be addressed before automatic discovery of services can be realized. A limitation of WSDL and SOAP is that, being purely syntactic, they cannot express the semantics of the underlying data and services, which renders them inaccessible to machines. Adding semantics to represent the requirements and capabilities of Web services is essential for achieving unambiguity and machine interpretability. Work on automatic, higher-level integration of Web services and data by machines is at an incipient stage and progressing slowly. The reasons such a useful infrastructure has been slow to materialize are: (1) it presupposes the presence of formal logic over Web resources (i.e., a Web represented in semantic languages, McIlraith et al., 2001 ); (2) Web services currently use WSDL, which lacks a way to describe semantics, although WSDL-S (semantic markup for the Web services description language) was recently proposed as a solution to this problem (Miller et al., 2004 ); and (3) there is a lack of efficient Web crawlers to index Web service descriptions analogous to what Google does for Web pages. The added value that a Web service search engine can provide can be witnessed in the BioCatalogue (Goble and De Roure, 2008 ) and BioMoby (Wilkinson et al., 2005 ) projects, which are manually curated search engines for Web services.

The adoption of Web3.0 will revolutionize the way we manage data online, exchange information with each other and discover knowledge from rapidly accumulating biological data through collective intelligence. Web3.0 is defined by Wiktionary as “the predicted third generation of the World Wide Web usually conjectured to include semantic tagging of content” (Wiktionary: a wiki-based open content dictionary, 2010 19 ). In other words, Web3.0 is an extension of second-generation Web2.0 technology: a semantics-aware Web in which each service is closely coupled with a formalized description. The range of tools envisioned as part of the Semantic Web (inference-based search, ontologies, SPARQL) will be available in Web3.0. Conceptually, in Web3.0 the entire Web is viewed as one large integrated database, allowing structured information to be read by different programs across the Web and enabling users to perform more accurate searches and find precisely what they want.

Scientific workflows are ideal for in silico experimentation in advancing systems research. Several workflow systems are well developed, including the Taverna workflow workbench (Oinn et al., 2006 ), Kepler (Ludäscher et al., 2006 ), Triana (Taylor et al., 2007 ) and Pegasus (Deelman et al., 2005 ). Workflows comprise a number of master services, described in a WSDL file, that coordinate or aggregate activities. Some workflows are highly scalable, spanning multiple domains and organizations dispersed across different geographical locations. In such scenarios, user interaction with the workflow at several intermediate stages is preferable, facilitating interactive steering and monitoring: the user gains control over exception handling, data monitoring and the choice of alternative work paths (steering) depending on the results observed at runtime. Workflow management will be especially crucial for the computationally intensive and long-running workflows and services typically encountered in in silico systems-scale analysis. Further, the Business Process Execution Language (BPEL) has recently been proposed; it has several key advantages for specifying scientific workflows in a distributed computational setting (Akram et al., 2006 ; Tan et al., 2010 ).

To succeed, systems research must be a collaborative, cross-disciplinary and broadly organizational endeavor, similar to successful initiatives such as Alzforum (Lam et al., 2007 ), MyExperiment (Goble et al., 2010 ) and Payao (Matsuoka et al., 2010 ). Several tasks integral to systems research (such as ontology development, URI standardization and tool development) require the involvement of scientists from different backgrounds, including biologists, physicians, computer scientists, mathematicians and statisticians. To facilitate this type of multidisciplinary interaction, certain prevailing challenges must be met, including: (1) adoption of machine-readable data-representation formats, including semantically aware formats; (2) workflows that address data quality and integrity; (3) implementation of resource identity; and (4) tracking of provenance and ownership. The vision of Web2.0, and particularly Web3.0, wraps these ideas into the framework of the Semantic Web. Useful collaborative Web tools and applications such as wikis, blogs, mashups and lightweight Web apps for on-demand integration of distributed Web resources will become available to the systems-driven research community, fostering active participation and offering an opportunity to take advantage of their integrative and analytical potential.

Concluding Remarks

We have discussed the current state-of-the-art approaches and technologies, and the open issues, of database and Web application implementation in the context of systems-driven research. We first presented the issues associated with data integration, then discussed how these issues have been tackled, and finally considered the remaining open issues and their possible solutions. Despite considerable progress in the relevant technologies and efforts toward establishing an efficient computational platform, the integration of biological data to meet the needs of systems-driven research will remain a challenging problem for the present and the foreseeable future. We need to stay attuned to three important aspects that drive the field: science, technology and society. Only through a concerted effort, supported by all players in the research community, including database providers, funding agencies, and experimental and theoretical biologists, will we be able to bring about a revolution in systems-driven research. To this end, selecting and implementing the most appropriate technology is of paramount importance.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

This work was supported by the following grants: (1) the NRF grant (2010-0002159) funded by the Korean Ministry of Education, Science and Technology (MEST) and (2) the Gwangju Institute of Science and Technology (GIST) Systems Biology Infrastructure Establishment Grant (2010).

1 http://www.wikipedia.org/

2 http://www.w3.org/Graphics/SVG/

3 Cytoscape cPath PlugIn

4 http://www.hindawi.com/journals/abi/2010/

5 http://www.uddi.org423589.cta.html

6 http://www.w3.org/RDF/FAQ

7 http://www.w3.org/TR/rdf-schema/

8 http://www.w3.org/TR/owl-features/

9 http://www.w3.org/TR/rdf-sparql-query/

10 http://sourceforge.net/projects/pathwayontology/

11 http://www.obofoundry.org/

12 http://www.w3.org/2001/12/URI/

13 http://www.semwebcentral.org/?page_id=12

14 http://docs.google.com/

15 http://www.w3.org/2001/sw/hcls/

16 Available on: http://sharedname.org/page/Main_Page

17 http://www.nlm.nih.gov/research/umls/about_umls.html

18 Available on: http://ncit.nci.nih.gov/

19 http://en.wiktionary.org/wiki/Web_3.0

  • Abramson D., Bernabeu M. O., Bethwaite B., Burrage K., Corrias A., Enticott C., Garic S., Gavaghan D., Peachey T., Pitt-Francis J., Pueyo E., Rodriguez B., Sher A., Tan J. (2010). High-throughput cardiac science on the Grid . Philos. Transact. R. Soc. A Math. Phys. Eng. Sci. 368 , 3907–3923 [ PubMed ] [ Google Scholar ]
  • Achard F., Vaysseix G., Barillot E. (2001). XML, bioinformatics and data integration . Bioinformatics 17 , 115–125 10.1093/bioinformatics/17.2.115 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Akram A., Meredith D., Allan R. (2006). “Evaluation of BPEL to scientific workflows,” in Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid IEEE Computer Society [ Google Scholar ]
  • Almeida J. S., Chen C., Gorlitsky R., Stanislaus R., Aires-de-Sousa M., Eleuterio P., Carrico J., Maretzek A., Bohn A., Chang A., Zhang F., Mitra R., Mills G. B., Wang X., Deus H. F. (2006). Data integration gets “Sloppy” . Nat. Biotechnol. 24 , 1070–1071 10.1038/nbt0906-1070 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Amin K., Von Laszewski G., Hategan M., Zaluzec N. J., Hampton S., Rossi A. (2004). “GridAnt: a client-controllable grid workflow system,” in Proceedings of the Hawaii International Conference on System Sciences, Big Island, Hawaii 10.1109/HICSS.2004.1265491 [ CrossRef ] [ Google Scholar ]
  • Antezana E., Egaña M., Blondé W., Illarramendi A., Bilbao I., De Baets B., Stevens R., Mironov V., Kuiper M. (2009). The cell cycle ontology: an application ontology for the representation and integrated analysis of the cell cycle process . Genome Biol. 10 , R58.01–R58.19 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Antezana E., Kuiper M., Mironov V. (2009). Biological knowledge management: the emerging role of the semantic web technologies . Brief. Bioinform. 10 , 392–407 10.1093/bib/bbp024 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Arbona A., Benkner S., Engelbrecht G., Fingberg J., Hofmann M., Kumpf K., Lonsdale G., Woehrer A. (2007). A service-oriented grid infrastructure for biomedical data and compute services . IEEE Trans. Nanobiosci. 6 , 136–141 10.1109/TNB.2007.897438 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Asai Y., Suzuki Y., Kido Y., Oka H., Heien E., Nakanishi M., Urai T., Hagihara K., Kurachi Y., Nomura T. (2008). Specifications of insilicoML 1.0: a multilevel biophysical model description language . J. Physiol. Sci. 58 , 447–458 10.2170/physiolsci.RP013308 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Ashburner M., Ball C. A., Blake J. A., Botstein D., Butler H., Cherry J. M., Davis A. P., Dolinski K., Dwight S. S., Eppig J. T., Harris M. A., Hill D. P., Issel-Tarver L., Kasarskis A., Lewis S., Matese J. C., Richardson J. E., Ringwald M., Rubin G. M., Sherlock G. (2000). Gene ontology: tool for the unification of biology. The gene ontology consortium . Nat. Genet. 25 , 25–39 10.1038/75556 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Baker M. (2010). Next-generation sequencing: adjusting to data overload . Nat. Methods 7 , 495–499 10.1038/nmeth0710-495 [ CrossRef ] [ Google Scholar ]
  • Bard J. B., Rhee S. Y. (2004). Ontologies in biology: design, applications and future challenges . Nat. Rev. Genet. 5 , 213–222 10.1038/nrg1295 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Basu M. K. (2001). SeWeR: a customizable and integrated dynamic HTML interface to bioinformatics services . Bioinformatics 17 , 577–578 10.1093/bioinformatics/17.6.577 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Bateman A., Wood M. (2009). Cloud computing . Bioinformatics 25 , 1475. 10.1093/bioinformatics/btp274 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Benkner S., Berti G., Engelbrecht G., Fingberg J., Kohring G., Middleton S. E., Schmidt R. (2005). GEMSS: grid-infrastructure for medical service provision . Methods Inf. Med. 44 , 177–181 [ PubMed ] [ Google Scholar ]
  • BioPAX (2006). Pathway Exchange Language for Biological Pathway Data Available at: www.biopax.org
  • Biowisdom (2010). Public SRS Installations Available at: http://www.biowisdom.com/download/srs-parser-and-software-downloads/public-srs-installations/
  • Bo C., Liu Q., Yang G. (2005). Distributed gridflow model and implementation . Lect. Notes Comput. Sci. 3379 , 84–87 10.1007/11577188_12 [ CrossRef ] [ Google Scholar ]
  • Boyle J., Cavnor C., Killcoyne S., Shmulevich I. (2008). Systems biology driven software design for the research enterprise . BMC Bioinformatics 9 , 295 10.1186/1471-2105-9-295 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Brooksbank C., Cameron G., Thornton J. (2005). The European bioinformatics institute's data resources: towards systems biology . Nucleic Acids Res. 33 , D46–D53 10.1093/nar/gki026 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Burrage K., Hood L., Ragan M. A. (2006). Advanced computing for systems biology . Brief. Bioinform. 7 , 390–398 10.1093/bib/bbl033 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Cardiac Integrated Database Management System-Pathway Database (CIDMS-PD) (2009). An Innovative Systems Biology Knowledgebase for Cellular Pathways in Heart Available at: http://www.icsb-2009.org/schedule_details.php?ID=272
  • Carvalho P., Gloria R., de Miranda A., Degrave W. (2005). Squid – a simple bioinformatics grid . BMC Bioinformatics 6 , 197 10.1186/1471-2105-6-197 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Carver T., Bleasby A. (2003). The design of Jemboss: a graphical user interface to EMBOSS . Bioinformatics 19 , 1837–1843 10.1093/bioinformatics/btg251 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Cerami E. G., Bader G. D., Gross B. E., Sander C. (2006). cPath: open source software for collecting, storing, and querying biological pathways . BMC Bioinformatics 7 , 497 10.1186/1471-2105-7-497 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Cheung K.-H., Yip K. Y., Smith A., deKnikker R., Masiar A., Gerstein M. (2005). YeastHub: a semantic Web use case for integrating data in the life sciences domain . Bioinformatics 21 , i85–i96 10.1093/bioinformatics/bti1026 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • CIDMS (2007). Cardiac Integrated Database Management System for Cardiac Systems Biology Available at: www.icsb-2007.org/proceedings/abstracts/G06.pdf.
  • CIDMS-PD (2009). An Innovative Systems Biology Knowledgebase for Cellular Pathways Available at: http://cidms.org/pathways.
  • Clark T., Jurek J., Kettler G., Preuss D. (2005). A structured interface to the object-oriented genomics unified schema for XML-formatted data . Appl. Bioinformatics 4 , 13–24 10.2165/00822942-200504010-00002 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Ćurčin V., Ghanem M., Guo Y., Köhler M., Rowe A., Syed J., Wendel P. (2002). “Discovery net: toward a grid of knowledge discovery,” in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining ACM, Edmonton [ Google Scholar ]
  • Czauderna T., Klukas C., Schreiber F. (2010). Editing, validating, and translating of SBGN maps . Bioinformatics 26 , 2340–2341 10.1093/bioinformatics/btq407 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Deelman E., Singh G., Su M. H., Blythe J., Gil Y., Kesselman C., Mehta G., Vahi K., Berriman G. B., Good J., Laity A., Jacob J. C., Katz D. S. (2005). Pegasus: a framework for mapping complex scientific workflows onto distributed systems . Sci. Program. 13 , 219–237 [ Google Scholar ]
  • Fenner J. W., Brook B., Clapworthy G., Coveney P. V., Feipel V., Gregersen H., Hose D. R., Kohl P., Lawford P., McCormack K. M., Pinney D., Thomas S. R., Van Sint Jan S., Waters S., Viceconti M. (2008). The EuroPhysiome, STEP and a roadmap for the virtual physiological human . Philos. Transact. R. Soc. A Math. Phys. Eng. Sci. 366 , 2979–2999 [ PubMed ] [ Google Scholar ]
  • Finney A., Hucka M. (2003). Systems biology markup language: level 2 and beyond . Biochem. Soc. Trans. 31 , 1472–1473 10.1042/BST0311472 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Foster I. (2005). Service-oriented science . Science 308 , 814–817 10.1126/science.1110411 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Funahashi A., Jouraku A., Matsuoka Y., Kitano H. (2007). Integration of CellDesigner and SABIO-RK . In Silico Biol. 7 , S81– S90 [ PubMed ] [ Google Scholar ]
  • Gehlenborg N., O'Donoghue S. I., Baliga N. S., Goesmann A., Hibbs M. A., Kitano H., Kohlbacher O., Neuweger H., Schneider R., Tenenbaum D., Gavin A. C. (2010). Visualization of omics data for systems biology . Nat. Methods 7 , S56–S68 10.1038/nmeth.1436 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Gentleman R. C., O'Donoghue S. I., Baliga N. S., Goesmann A., Hibbs M. A., Kitano H., Kohlbacher O., Neuweger H., Schneider R., Tenenbaum D., Gavin A. C. (2004). Bioconductor: open software development for computational biology and bioinformatics . Genome Biol. 5 , R80.01–R80.16 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Goble C. A., Bhagat J., Aleksejevs S., Cruickshank D., Michaelides D., Newman D., Borkum M., Bechhofer S., Roos M., Li P., De Roure D. (2010). MyExperiment: a repository and social network for the sharing of bioinformatics workflows . Nucl. Acids Res 38 , W677–W682 10.1093/nar/gkq429 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Goble C. A., De Roure D. (2008). Curating Scientific Web Services and Workflows . Educause 43 , 1527–6619 [ Google Scholar ]
  • Good B. M., Wilkinson M. D. (2006). The life sciences semantic web is full of creeps! . Brief. Bioinform. 7 , 275–286 10.1093/bib/bbl025 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Gordon P. M., Trinh Q., Sensen C. W. (2007). Semantic web service provision: a realistic framework for bioinformatics programmers . Bioinformatics 23 , 1178–1180 10.1093/bioinformatics/btm060 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Guarino N. (1998). “Formal ontology in information systems,” in Proceedings of the 1st International Conference June 6–8, 1998. Trento: IOS Press [ Google Scholar ]
  • Gudivada R. (2007). “A genome – phenome integrated approach for mining disease-causal genes using Semantic Web,” in Health Care and Life Sciences Data Integration for the Semantic Web, Sixteenth International World Wide Web Conference (WWW2007) Workshops, Athens, GA [ Google Scholar ]
  • Haas L. M., Schwarz P. M., Kodali P., Kotlar E., Rice J. E., Swope W. C. (2001). DiscoveryLink: a system for integrated access to life sciences data sources . IBM Syst. J. 40 , 489–511 10.1147/sj.402.0489 [ CrossRef ] [ Google Scholar ]
  • Haider S., Ballester B., Smedley D., Zhang J., Rice P., Kasprzyk A. (2009). BioMart Central Portal – unified access to biological data . Nucl. Acids Res. 37 , W23–W27 10.1093/nar/gkp265 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Harris P. J., Buyya R., Chu X., Kobialka T., Kazmierczak E., Moss R., Appelbe W., Hunter P. J., Thomas S. R. (2009). The Virtual Kidney: an eScience interface and grid portal . Philos. Transact. R. Soc. A: Math. Phys. Eng. Sci. 367 , 2141–2159 10.1098/rsta.2008.0291 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Harris S., Gibbins N. (2003). 3Store: Efficient Bulk RDF Storage. PSSS . CEUR-WS.org
  • Hey T., Trefethen A. (2003). e-Science and its implications . Philos. Transact. R. Soc. Lond, A: Math. Phys. Eng. Sci. 361 , 1809–1825 10.1098/rsta.2003.1224 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Hmida M. M. B., Tomaz R. F., Monfort V. (2005). “Applying AOP concepts to increase web services flexibility,” in Proceedings of the International Conference on Next Generation Web Services Practices IEEE Computer Society [ Google Scholar ]
  • Hunter P., Coveney P. V., de Bono B., Diaz V., Fenner J., Frangi A. F., Harris P., Hose R., Kohl P., Lawford P., McCormack K., Mendes M., Omholt S., Quarteroni A., Skår J., Tegner J., Randall Thomas S., Tollis I., Tsamardinos I., van Beek J. H. G. M., Viceconti M. (2010). A vision and strategy for the virtual physiological human in 2010 and beyond . Philos. Transact. R. Soc. A: Math. Phys. Eng. Sci. 368 , 2595–2614 10.1098/rsta.2010.0048 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Hunter P. J., Crampin E. J., Nielsen P. M. F. (2008). Bioinformatics, multiscale modeling and the IUPS physiome project . Brief. Bioinform. 9 , 333–343 10.1093/bib/bbn024 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Huss J. W., 3rd, Orozco C., Goodale J., Wu C., Batalov S., Vickers T. J., Valafar F., Su A. I. (2008). A gene wiki for community annotation of gene function . PLoS Biol. 6 , e175 10.1371/journal.pbio.0060175 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Jenkinson A., Albrecht M., Birney E., Blankenburg H., Down T., Finn R., Hermjakob H., Hubbard T., Jimenez R., Jones P., Kahari A., Kulesha E., Macias J., Reeves G., Prlic A. (2008). Integrating biological data – the distributed annotation system . BMC Bioinformatics 9 , S3 10.1186/1471-2105-9-S8-S3 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Jurisica I., Mylopoulos J., Yu E. (2004). Ontologies for knowledge management: an information systems perspective . Knowl. Inf. Syst. 6 , 380–401 10.1007/s10115-003-0135-4 [ CrossRef ] [ Google Scholar ]
  • Kamel Boulos M. N., Wheeler S. (2007). The emerging Web 2.0 social software: an enabling suite of sociable technologies in health and health care education . Health Info. Libr. J. 24 , 2–23 10.1111/j.1471-1842.2007.00701.x [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kawas E., Senger M., Wilkinson M. D. (2006). BioMoby extensions to the Taverna workflow management and enactment software . BMC Bioinformatics 7 , 523 10.1186/1471-2105-7-523 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kiczales G., Lamping J., Mendhekar A. (1997). “Aspect oriented programming,” in The 11th European Conference of Object-Oriented Programming, LNCS 1241, Jyväskylä [ Google Scholar ]
  • Kitano H., Funahashi A., Matsuoka Y., Oda K. (2005). Using process diagrams for the graphical representation of biological networks . Nat. Biotechnol. 23 , 961–966 10.1038/nbt1111 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Komatsoulis G. A., Warzel D. B., Hartel F. W., Shanbhag K., Chilukuri R., Fragoso G., Coronado S., Reeves D. M., Hadfield J. B., Ludet C., Covitz P. A. (2008). caCORE version 3: Implementation of a model driven, service-oriented architecture for semantic interoperability . J. Biomed. Inform. 41 , 106–123 10.1016/j.jbi.2007.03.009 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Krause F., Uhlendorf J., Lubitz T., Schulz M., Klipp E., Liebermeister W. (2010). Annotation and merging of SBML models with semanticSBML . Bioinformatics 26 , 421–422 10.1093/bioinformatics/btp642 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Kushida T., Takagi T., Fukuda K. I. (2006). Event ontology: a pathway-centric ontology for biological processes . Pac. Symp. Biocomput. 11 , 152–163 10.1142/9789812701626_0015 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Laibe C., Le Novere N. (2007). MIRIAM Resources: tools to generate and resolve robust cross-references in systems biology . BMC Syst. Biol. 1 , 58 10.1186/1752-0509-1-58 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Lam H. Y., Marenco L., Clark T., Gao Y., Kinoshita J., Shepherd G., Miller P., Wu E., Wong G. T., Liu N., Crasto C., Morse T., Stephens S., Cheung K. H. (2007). AlzPharm: integration of neurodegeneration data using RDF . BMC Bioinformatics 8 ( Suppl. 3 ), S4. 10.1186/1471-2105-8-S3-S4 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Langmead B., Schatz M. C., Lin J., Pop M., Salzberg S. L. (2009). Searching for SNPs with cloud computing . Genome Biol 10 , R134. 10.1186/gb-2009-10-11-r134 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Le Novere N. (2006). Model storage, exchange and integration . BMC Neurosci. 7 ( Suppl. 1 ), S11. 10.1186/1471-2202-7-S1-S11 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Le Novère N., Hucka M., Mi H., Moodie S., Schreiber F., Sorokin A., Demir E., Wegner K., Aladjem M. I., Wimalaratne S. M., Bergman F. T., Gauges R., Ghazal P., Kawaji H., Li L., Matsuoka Y., Villéger A., Boyd S. E., Calzone L., Courtot M., Dogrusoz U., Freeman Tom C., Funahashi A., Ghosh S., Jouraku A., Kim S., Kolpakov F., Luna A., Sahle S., Schmidt E., Watterson S., Wu G., Goryanin I., Kell D. B., Sander C., Sauro H., Snoep J. L., Kohn K., Kitano H. (2009). The systems biology graphical notation . Nat. Biotechnol 27 , 735–741 10.1038/nbt.1558 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Lee T. J., Pouliot Y., Wagner V., Gupta P., Stringer-Calvert D. W. J., Tenenbaum J. D., Karp P. D. (2006). BioWarehouse: a bioinformatics database warehouse toolkit . BMC Bioinformatics 7 , 170 10.1186/1471-2105-7-170 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Li C., Courtot M., Le Novere N., Laibe C. (2009). BioModels.net web services, a free and integrated toolkit for computational modelling software . Brief. Bioinform. 11 , 270–277 10.1093/bib/bbp056 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Lloyd C. M., Halstead M. D., Nielsen P. F. (2004). CellML: its future, present and past . Prog. Biophys. Mol. Biol. 85 , 433–450 10.1016/j.pbiomolbio.2004.01.004 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Ludäscher B., Ludscher B., Altintas I., Berkley C., Higgins D., Jaeger E., Jones M., Lee E. A,, Tao J., Zhao Y. (2006). Scientific workflow management and the Kepler system . Concurr. Comput 18 , 1039–1065 10.1002/cpe.994 [ CrossRef ] [ Google Scholar ]
  • Maier C. W., Long J. G., Hemminger B. M., Giddings M. C. (2009). Ultra-Structure database design methodology for managing systems biology data and analyses . BMC Bioinformatics 10 , 254 10.1186/1471-2105-10-254 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Martin A. C. R. (2001). Can we integrate bioinformatics data on the internet? Trends Biotechnol. 19 , 327–328 10.1016/S0167-7799(01)01733-4 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Martin S., Hohman M. M., Liefeld T. (2005). The impact of life science identifier on informatics data . Drug Discov. Today 10 , 1566–1572 10.1016/S1359-6446(05)03651-2 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Matsuoka Y., Ghosh S., Kikuchi N., Kitano H. (2010). Payao: a community platform for SBML pathway model curation . Bioinformatics 26 , 1381–1383 10.1093/bioinformatics/btq143 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Mazza T. (2010). Editorial: accelerating systems biology . Brief. Bioinform. 11 , 267–269 10.1093/bib/bbq012 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • McIlraith S. A., Son T. C., Honglei Z. (2001). Semantic web services . Intell. Syst., IEEE 16 , 46–53 [ Google Scholar ]
  • Miller J., Kunal V., Rajasekaran P., Sheth S., Agggarwal R., Sivashanmugam K. (2004). WSDL-S: Adding Semantics to WSDL – White Paper . Available at: http://lsdis.cs.uga.edu/library/download/wsdl-s.pdf
  • Mlecnik B., Scheideler M., Hackl H., Hartler J., Sanchez-Cabo F., Trajanoski Z. (2005). PathwayExplorer: web service for visualizing high-throughput expression data on biological pathways . Nucl. Acids Res. 33 , W633– W637 10.1093/nar/gki391 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Mons B., Ashburner M., Chichester C., van Mulligen E., Weeber M., den Dunnen J., van Ommen G. J., Musen M., Cockerill M., Hermjakob H., Mons A., Packer A., Pacheco R., Lewis S., Berkeley A., Melton W., Barris N., Wales J., Meijssen G., Moeller E., Roes P. J., Borner K., Bairoch A. (2008). Calling on a million minds for community annotation in WikiProteins . Genome Biol 9 , R89. 10.1186/gb-2008-9-5-r89 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Neerincx P. B., Leunissen J. A. (2005). Evolution of web services in bioinformatics . Brief. Bioinform. 6 , 178–188 10.1093/bib/6.2.178 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Nord R. L., Czarnecki K. (2004). Generative Software Development . Springer Berlin/Heidelberg: Product Lines, 148–151 [ Google Scholar ]
  • Noy N. F., Shah N. H., Whetzel P. L., Dai B., Dorf M., Griffith N., Jonquet C., Rubin D. L., Storey M.-A., Chute C. G., Musen M.A. (2009). BioPortal: ontologies and integrated data resources at the click of a mouse . Nucl. Acids Res. gkp440 , 37 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Oinn T., Greenwood M., Addis M., Alpdemir M. N., Ferris J., Glover K., Goble C., Goderis A., Hull D., Marvin D., Li P., Lord P., Pocock M.R., Senger M., Stevens R., Wipat A., Wroe C. (2006). Taverna: lessons in creating a workflow environment for the life sciences . Concurr. Comput. 18 , 1067–1100 10.1002/cpe.993 [ CrossRef ] [ Google Scholar ]
  • Pico A. R., Kelder T., van Iersel M. P., Hanspers K., Conklin B. R., Evelo C. (2008). WikiPathways: pathway editing for the people . PLoS Biol. 6 , e184 10.1371/journal.pbio.0060184 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Post L. J. G., Roos M., Marshall M. S., van Driel R., Breit T. M. (2007). A semantic web approach applied to integrative bioinformatics experimentation: a biological use case with genomics data . Bioinformatics 23 , 3080–3087 10.1093/bioinformatics/btm461 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Puustjarvi J., Puustjarvi L. (2009). The role of medicinal ontologies in querying and exchanging pharmaceutical information . Int. J. Electron. Healthc. 5 , 1–13 10.1504/IJEH.2009.026269 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Reddy M. P., Prasad B. E., Reddy P. G. (1994). A methodology for integration of heterogeneous databases . IEEE Trans. Knowl. Data Eng. 6 , 920–933 10.1109/69.334882 [ CrossRef ] [ Google Scholar ]
  • Ruttenberg A., Clark T., Bug W., Samwald M., Bodenreider O., Chen H., Doherty D., Forsberg K., Gao Y., Kashyap V., Kinoshita J., Luciano J., Marshall M. S., Ogbuji C., Rees J., Stephens S., Wong G. T., Wu E., Zaccagnini D., Hongsermeier T., Neumann E., Herman I., Cheung K. H. (2007). Advancing translational research with the Semantic Web . BMC Bioinformatics 8 ( Suppl. 3 ), S2 10.1186/1471-2105-8-S3-S2 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Ruttenberg A., Rees J. A., Samwald M., Marshall M. S. (2009). Life sciences on the semantic web: the Neurocommons and beyond . Brief. Bioinform. 10 , 193–204 10.1093/bib/bbp004 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Safran M., Solomon I., Shmueli O., Lapidot M., Shen-Orr S., Adato A., Ben-Dor U., Esterman N., Rosen N., Peter I., Olender T., Chalifa-Caspi V., Lancet D. (2002). GeneCards 2002: towards a complete, object-oriented, human gene compendium . Bioinformatics 18 , 1542–1543 10.1093/bioinformatics/18.11.1542 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Sagotsky J.A., Zhang L., Wang Z., Martin S., Deisboeck T. S. (2008). Life sciences and the web: a new era for collaboration . Mol. Syst. Biol. 4 , 201. 10.1038/msb.2008.39 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Saltz J., Oster S., Hastings S., Langella S., Kurc T., Sanchez W., Kher M., Manisundaram A., Shanbhag K., Covitz P. (2006). caGrid: design and implementation of the core architecture of the cancer biomedical informatics grid . Bioinformatics 22 , 1910–1916 10.1093/bioinformatics/btl272 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Sayers E. W., Barrett T., Benson D. A., Bolton E., Bryant S. H., Canese K., Chetvernin V., Church D. M., Dicuccio M., Federhen S., Feolo M., Geer L. Y., Helmberg W., Kapustin Y., Landsman D., Lipman D. J., Lu Z., Madden T. L., Madej T., Maglott D. R., Marchler-Bauer A., Miller V., Mizrachi I., Ostell J., Panchenko A., Pruitt K. D., Schuler G. D., Sequeira E., Sherry S.T., Shumway M., Sirotkin K., Slotta D., Souvorov A., Starchenko G., Tatusova T. A., Wagner L., Wang Y., John Wilbur W., Yaschenko E., Ye J. (2010). Database resources of the national center for biotechnology information . Nucleic Acids Res. 38 , D5–D16 10.1093/nar/gkp967 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Schatz M. C., Langmead B., Salzberg S. L. (2010). Cloud computing and the DNA data race . Nat Biotechnol. 28 , 691–693 10.1038/nbt0710-691 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Shah S. P., Huang Y., Xu T., Yuen M. M., Ling J., Ouellette B. F. (2005). Atlas – a data warehouse for integrative bioinformatics . BMC Bioinformatics 6 , 34 10.1186/1471-2105-6-34 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Shannon P., Markiel A., Ozier O., Baliga N. S., Wang J. T., Ramage D., Amin N., Schwikowski B., Ideker T. (2003). Cytoscape: a software environment for integrated models of biomolecular interaction networks . Genome Res. 13 , 2498–2504 10.1101/gr.1239303 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Sheth A. P., Larson J. A. (1990). Federated database systems for managing distributed, heterogeneous, and autonomous databases . ACM Comput. Surv. 22 , 183–236 10.1145/96602.96604 [ CrossRef ] [ Google Scholar ]
  • Splendiani A. (2008). RDFScape: semantic web meets systems biology . BMC Bioinformatics 9 ( Suppl. 4 ), S6 10.1186/1471-2105-9-S4-S6 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Stein L. (2010). The case for cloud computing in genome informatics . Genome Biol. 11 , 207. 10.1186/gb-2010-11-5-207 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Stein L. D. (2003). Integrating biological databases . Nat. Rev. Genet. 4 , 337–345 10.1038/nrg1065 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Stevens R., Aranguren M. E., Wolstencroft K., Sattler U., Drummond N., Horridge M., Rector A. (2007). Using OWL to model biological knowledge . Int. J. Hum. Comput. Stud. 65 , 583–594 10.1016/j.ijhcs.2007.03.006 [ CrossRef ] [ Google Scholar ]
  • Stevens R., Paton N. W., Baker P., Ng G., Goble C. A., Bechhofer S., Brass A. (1999). “TAMBIS Online: a bioinformatics source integration tool,” in Proceedings of the 11th International Conference on Scientific and Statistical Database Management IEEE Computer Society.– [ Google Scholar ]
  • Tan W., Missier P., Foster I., Madduri R., De Roure D., Goble C. (2010). A comparison of using Taverna and BPEL in building scientific workflows: the case of caGrid . Concurr. Comput. 22 , 1098–1117 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Taylor I. J., Deelman E., Gannon D. B., Shields M., Taylor I., Wang I., Harrison A. (2007). The Triana Workflow Environment: Architecture and Applications . London: Springer, In Workflows for e-Science, 320–339 [ Google Scholar ]
  • Tyrelle G. (2005). The Semantic Web for Life Sciences Now! Available at: http://archive.nodalpoint.org/node/1704
  • Van Hemert J. L., Dickerson J. A. (2010). PathwayAccess: celldesigner plugins for pathway databases . Bioinformatics . [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • W3C. W3C (2002). Web Services Activity group Available at: http://www.w3.org/2002/ws/
  • W3C/CWI (2010). XForms Available at: http://www.w3.org/MarkUp/Forms/
  • Walton N. A., Brenton J. D., Caldas C., Irwin M. J., Akram A., Gonzalez-Solares E., Lewis J. R., Maccallum P. H., Morris L. J., Rixon G. T. (2010). PathGrid: a service-orientated architecture for microscopy image analysis . Philos. Transact. R. Soc. A Math. Phys. Eng. Sci. 368 , 3937–3952 [ PubMed ] [ Google Scholar ]
  • Wang X., Gorlitsky R., Almeida J. S. (2005). From XML to RDF: how semantic web technologies will change the design of “omic”standards . Nat. Biotechnol. 23 , 1099–1103 10.1038/nbt1139 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wang Z., Gao X., He C., Miller J. A., Kissinger J. C., Heiges M., Aurrecoechea C., Kraemer E.T., Pennington C. (2007). “A comparison of federated databases with web services for the integration of bioinformatics data,” in Conference on Bioinformatics and Computational Biology (BIOCOMP), Vegas, NV [ Google Scholar ]
  • WikiPathways (2008). GPML plugin for Cytoscape . Available on: http://www.pathvisio.org/wiki/Cytoscape_plugin
  • Wilkinson M., Links M. (2002). BioMOBY: an open source biological Web services proposal . Brief. Bioinformatics 3 , 331–341 [ PubMed ] [ Google Scholar ]
  • Wilkinson M., Schoof H., Ernst R., Haase D. (2005). BioMOBY successfully integrates distributed heterogeneous bioinformatics web services. The PlaNet exemplar case . Plant Physiol. 138 , 5–17 10.1104/pp.104.059170 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Wood J. D. (2004). The first nobel prize for integrated systems physiology: Ivan Petrovich Pavlov, 1904 . Physiology 19 , 326–330 10.1152/physiol.00034.2004 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zdobnov E. M., Lopez R., Apweiler R., Etzold T. (2002). The EBI SRS server – recent developments . Bioinformatics 18 , 368–373 10.1093/bioinformatics/18.2.368 [ PubMed ] [ CrossRef ] [ Google Scholar ]
  • Zhang Z., Cheung K.-H., Townsend J. P. (2009). Bringing web 2.0 to bioinformatics . Brief. Bioinform. 10 , 1–10 10.1093/bib/bbn041 [ PMC free article ] [ PubMed ] [ CrossRef ] [ Google Scholar ]

Sungsoo Kim's Blog

Research Topics in Database Management Systems

29 January 2016

Course description

This seminar focuses on recent research results in the intersection of data management and systems. There is no formal textbook for this course. We will mostly be reading and discussing recently published papers in venues such as SIGMOD, VLDB and ICDE. An important component of the course is an individual research project, where you will pick one topic of interest in the area of database management systems and explore it in depth.

This course mainly discusses the latest research findings on data management and builds on the foundations introduced in CSE 5242, the Advanced Database Management Systems course. Note that this course does not have the structure (a textbook, exams, or help from a GTA) to guide you to success if you are not motivated to study and conduct independent research.

Paper summaries

In order to make the most of our in-class time, you are expected to submit a summary of the assigned reading before each class. For all questions, don’t just paraphrase (or copy verbatim) what is written in the paper: papers frequently have contributions that differ from the ones their authors claimed when writing them.

Paper summaries will be graded on a scale from zero to two. Zero is reserved for summaries that have not been submitted or are unreadable. One reflects a summary that can be improved, whether in length, clarity, or insight. Two represents a solid effort at summarizing the paper. One bonus point will be given to a few exceptionally insightful summaries.

Each summary must answer exactly the following questions. Remember that summaries are graded on clarity and insight, and not their length!

  • What is your name?
  • What is the paper you are summarizing?
  • What problem was this paper addressing?
  • What was the existing solution to this problem?
  • What solution was this paper proposing?
  • What are the conclusions you draw from the results?
  • List three things you appreciated when reading this paper.
  • List three things you believe can be improved in this paper.

Answers to all questions are due by 1am on the day the paper is discussed. Upload your answers to Carmen as a single plaintext file. Please include the questions in the submitted file. No Microsoft Word or Adobe Acrobat files will be accepted.

Class project

You will also work on an individual research project on a topic of mutual research interest. (Group projects will not be allowed.) I can provide a list of ideas on interesting topics and discuss any ideas you have.

It is your responsibility to meet with the instructor periodically throughout the semester to discuss the general direction and the progress of the class project. You must take the initiative to actively explore the topic you choose, or else you will not accomplish much in the project and your class project grade will be adversely impacted as a consequence.

Final report: The final project report should be at most twelve pages of text and figures in 11-point font. This includes any references to publications, URLs, manuals, etc. I will be looking for answers to the following questions:

  • What is the problem you are solving?
  • What have others done already to solve this or a similar problem?
  • What is your solution, and what did you accomplish during the last three months?
  • What are the results? Does your solution improve over what prior work has already accomplished?
  • In retrospect, what could you have done better in this project?
  • If someone else looks at this problem in the future, what are the aspects of the problem that you did not have time to explore?

Source code: Before submitting your source code, please delete any intermediate files and executable binaries. (These will not work on any platform other than your own system.) If you have worked with a large codebase (PostgreSQL, Impala, MySQL, etc.), please only submit a diff of your changes, and include a reference to the “base” version you modified. Examples include “PostgreSQL 9.4.0”, or “Linux 3.x development branch, git commit f3f62a38ce”.
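As a concrete illustration, here is a minimal sketch in Python of how such a diff could be captured. It assumes your changes live in a git checkout of the base code and that the git command is on your PATH; the tag and file names below are placeholders for illustration only, not part of the course requirements.

import subprocess

def export_diff(base_ref: str, output_path: str) -> None:
    """Write the output of `git diff <base_ref>` to a file for submission."""
    result = subprocess.run(
        ["git", "diff", base_ref],          # compare the working tree against the stated base
        capture_output=True, text=True, check=True,
    )
    with open(output_path, "w") as fh:
        fh.write(result.stdout)

if __name__ == "__main__":
    # Hypothetical example: diff against a PostgreSQL 9.4.0 base tag.
    export_diff("REL9_4_0", "project-changes.diff")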

If the source code is small (a few MBs), please upload it with your report on Carmen. Ohio State offers BuckeyeBox, a version of the Box file sharing service, for this purpose, which you can access using your Carmen credentials. It is not necessary to use this service, as long as you include a link to your source code.

Sungsoo Kim, Principal Research Scientist

78 MIS Topics for Presentations and Essays

  • 🏆 Best Management Information Systems Project Ideas
  • 🎓 Interesting Topics Related to Information Systems
  • ✅ Simple & Easy MIS Assignment Topics

  • Management Information System Implementation in the Bank This conforms to the first principle of change in which a person is adjusted via a change in the system that they work in.
  • Management Information Systems and Enterprise Resource Planning In addition to heavy investment in the staff who left, their departure led to delay in the areas they were in charge of as well as repeating some of the steps already done during the […]
  • Management Information System: Operational Efficiency and Decision-Making The customers as well are in a position to be aware of the status of their deliveries by logging in to the company’s website which is updated by the servers throughout.
  • Management Information Systems Major: Courses and Careers Knowledge on Management information systems is vital to institutions on a management height, where it is employed to preserve and build up new techniques for organizing vast amounts of information and helping managers in the […]
  • Management Information Systems in Organizational Performance The information system has enabled the organisation to solve problems like inappropriate use of time, increased expenditure, and customer dissatisfaction. Management information system is an important tool that can be used to shift the cost […]
  • Management Information Systems: Mitsubishi Motors A management information system is considered as one of the most effective and successful systems that are able to provide the necessary information in order to promote the development and management of any organization in […]
  • Management Information Systems and E-Government In the developing countries, it has been of much surprise to notice that, the failures of e-government project, is a problem that is real and much practical.
  • Management Information System in Business The main importance of information system to any modern organization is to store its data and that of its associates and customers in a secure manner.
  • Samsung Company’s Management Information System The scope of Management Information System is defined as, “The combination of human and computer based resources that results in the collection, storage, retrieval, communication and use of data for the purpose of efficient management […]
  • Management Information Systems Types: Functions and Importance A TPS is vital in managing the system resources as it maintains a pool of operating resources that are used in transaction processing, application program loading, and acquiring and releasing storage.
  • Management Information Systems and Business Decision-Making The article explains to its audience the importance of promoting and adapting the use of information systems to ensure that managers get the latest information in time.
  • Management Information Systems: LinkedIn Corporation It highlights how information technology has been used in management, the general operations of the organization as well as how the use of information systems has helped the organization to attain a competitive edge.
  • Types of Management Information Systems in Business Generally, a TPS is used to process the data that is required to update the records about the operations of a business.
  • Management Information Systems Analysis and Design The progress of this project will be based on a simple definition of a management information system which would be: a computer based system that provides flexible and speedy access to accurate data.
  • Management Information Systems: Effective Decision-Making and Security Through taking into account the different organizational levels within an organization management information systems are classified into four main types, namely, operational level systems, knowledge level systems, management level systems and strategic level systems. Management […]
  • Management Information Systems: Making Strategic Decisions The company will create a model of the relationship between all the pieces of information in the group. In this regard, the organization employs MISs in order to complete and integrate a series of elements […]
  • Management Information Systems in Corporate Institutions With the invention of personal computers and other information technology tools, the companies had to develop a proper information technology system that would handle the work of the organization and reduce the errors that were […]
  • Management Information System: Cisco Systems Prior to the implementation of the ERP system, the company’s systems were on the brink of failure. The management of the company understood the need for the company to shift to a new ERP system.
  • The Role of Management Information System (MIS) in Business The diagram below shows the relationship between the departments and underpins how the manual system which is used to conduct the primary and secondary activities within the departments is related to the performance of each […]
  • ABC Company Management Information System Increasing the presence of the firm’s products to specific segments of clients provides the customers with seamless shopping experience in the business’s physical and online stores.
  • Management Information Systems Benefits in Business This has helped this firm to achieve competitive advantage in the market because it is always aware of the needs of its customers. To manage this threat, ABC has discounted and differentiated its products in […]
  • Management Information System and Strategic Performance According to his assumption, the higher the demographic diversity in top management team, the greater the contribution of accounting system to strategic performance.
  • Relevant Decision Making: Management Information Systems in Organizations In this respect, managers are likely to make wrong decisions, especially, if they are unaware of the inaccuracy of the information provided by the system.
  • Fly Dubai Company’s Management Information Systems Data from the company’s website and its associated pilot training website outline the main sources of primary information. Identity refers to the ease that websites explain the nature, history, and values of a company.
  • Management Information Systems: Efficiency and Collaboration In addition, it is important to stress out that Microsoft Access allows a more flexible retrieval of data even when the volume of data gets high.
  • Management Information Systems and Its Impacts As thus, it is the obligation of the employees so see to it that they acquire the necessary knowledge and skills; otherwise, they will be washed out of the company system.
  • Management Information Systems: Socio-Technical Aspect Software: This component stands for programs that are used to operate the MIS, manage data, search and cipher through logs, and other related activities.
  • “Management Information Systems” by James O’Brien and George M. Marakas This is a network or sub-network with a high speed that interconnects different types of data storage devices that have associated data servers on behalf of a larger network of users. Through this, data can […]
  • Management Information System and Outsourcing According to these critics, there is a need for some of the currently outsourced services to be performed in the home country.
  • Management Information Systems: Ethics and Career Path The second one is the group of skills necessary to verify information, and the last one is meant to reason in a proper manner.
  • Imperial Tobacco. Management Information System – Competitive Forces This means that the management at Imperial Tobacco needs to develop products that can compete with the new products for them to maintain their position in the market.
  • Management Information Systems: Primis Online System at McGraw Hill This paper focuses on the analysis, design and system development elements applied by the Primis team in deployment of the online system at McGraw Hill.
  • Accounting and Management Information Systems This article is a discussion of the results obtained by Mangiuc in an empirical study that involved both local and foreign companies in Romania.
  • Healthcare Management Information Systems: An Evaluation In this perspective, the Chief Information Officer survey therefore becomes important for the Health Management Information System industry because it assist health institutions to project current and future informational and technological needs, not mentioning the […]
  • Healthcare Management Information Systems: Working Principles For instance, the ministry of health uses the network to disseminate health information to people in all regions and also globally.
  • Chalhoub Group: Management Information Systems This presentation will focus on one organization in UAE, highlighting how its improved IS/IT systems have helped it register massive profits.
  • Management Information Systems (MIS) The advances in the evolution of devices and the achievement of a new stage of development critically impact MIS and create the basis for the emergence of multiple changes towards the achievement of better outcomes […]
  • Health Management Information Systems: Impact on the Technology Implementation Since the beginning of the information systems implementation, the vast majority of spheres have adopted some cutting-edge technologies to increase the effectiveness of their working process.
  • Bespoke Management Information Systems Using Microsoft Access
  • Management Information System in Starbucks: IBM TPS System
  • Logistics Management Information Systems: Functions, Components, Examples
  • The Management Information Systems of Toyota: New Methods and Accomplish Business Goals
  • Management Information Systems for Shipping and Delivery Company
  • Management Information Systems of the Small and Medium Enterprises
  • Management Information Systems in Marketing: Kotler’s Model
  • Barriers to Successful Development of Strategic Management Information System
  • Management Accounting Information System: Auditing and Financial Reporting Modules
  • Warehouse Management Information System: Optimizing the Use of Available Space or Coordinating Tasks
  • Management Information Systems in Hospitals: Accounting for the Control of Doctors
  • Management Information Systems Through User Interface
  • Project Management Information System: Using More Efficiently, Without Getting Overwhelmed With Data
  • Information Management Systems in the Supply Chain
  • How Management Information Systems Affect Working Ethics
  • Human Resource Management System: The Best Tools in 2022
  • Management Information System for Real Estate and Property Management
  • Management Information System: Advantages and Disadvantages
  • The Technology of Information Management System
  • Management Information Systems for Computer-Aided Design
  • Management Information Systems: Enterprise Applications
  • The History of Management Information Systems: Five Eras
  • Management Information System: Development Process With System Development Life Cycle
  • Credit Management Information Systems: A Forward-Looking Approach
  • Common Problems in Management Information Systems
  • Management Information Systems at Rosenbluth Travel: Competitive Advantage in a Rapidly Growing Global Service Company
  • Why Can Management Information Systems Effectiveness Decreases
  • Management Information Systems: The Difference Between Advanced MIS and MI Dashboard
  • Developing Decision Support Capabilities Through the Use of Management Information Systems
  • Using National Education Management Information Systems to Make Local Service Improvements: The Case of Pakistan
  • How Might a Management Information System Be Used in a School
  • The External Organizational Environment and Its Impact on Management Information Systems
  • Management Information Systems: Managing the Digital Firm by Kenneth Laudon, Jane Laudon
  • The Disadvantage of Management Information System: Fraudulent Activities
  • Management Information Systems: Impact on Dairy Farm Profitability
  • Which Country Is Best in Management Information System
  • Management Information Systems Program for Poughkeepsie Children’s Home
  • Relationship Between Management Information Systems and Corporate Performance
  • Management Information Systems: Air Canada Takes off With Maintenix
  • Farm Management Information Systems Planning and Development in the Netherlands

IvyPanda. (2023, November 30). 78 MIS Topics for Presentations and Essays. https://ivypanda.com/essays/topic/management-information-systems-essay-topics/

"78 MIS Topics for Presentations and Essays." IvyPanda , 30 Nov. 2023, ivypanda.com/essays/topic/management-information-systems-essay-topics/.

IvyPanda . (2023) '78 MIS Topics for Presentations and Essays'. 30 November.

IvyPanda . 2023. "78 MIS Topics for Presentations and Essays." November 30, 2023. https://ivypanda.com/essays/topic/management-information-systems-essay-topics/.

1. IvyPanda . "78 MIS Topics for Presentations and Essays." November 30, 2023. https://ivypanda.com/essays/topic/management-information-systems-essay-topics/.

Bibliography

IvyPanda . "78 MIS Topics for Presentations and Essays." November 30, 2023. https://ivypanda.com/essays/topic/management-information-systems-essay-topics/.


COMMENTS

  1. 10 Current Database Research Topic Ideas in 2024

    This is where database topics for research paper [7] come in. By using database technology in video surveillance systems, it is possible to store and manage large amounts of video data efficiently. Database management systems (DBMS) can be used to organize video data in a way that is easily searchable and retrievable.

  2. 19024 PDFs

    Explore the latest full-text research PDFs, articles, conference papers, preprints and more on DATABASE MANAGEMENT SYSTEMS. Find methods information, sources, references or conduct a literature ...

  3. PDF Database management system performance comparisons: A systematic

    the database is used by one or several software applications via a DBMS. Collectively, the database, the DBMS, and the software application are referred to as a database system [31, p.7][17, p.65]. The separation of the database and the DBMS, especially in the realm of relational databases, is typically impossible without exporting the database ...

  4. Research Area: DBMS

    Berkeley also gave birth to many of the most widely-used open source systems in the field including INGRES, Postgres, BerkeleyDB, and Apache Spark. Today, our research continues to push the boundaries of data-centric computing, taking the foundations of data management to a broad array of emerging scenarios.

  5. Advances on Data Management and Information Systems

    This editorial paper overviews research topics covered in this special section of the Information Systems Frontiers journal. The special section contains papers invited from the 24th European Conference on Advances in Databases and Information Systems (ADBIS). The ADBIS conference has been running continuously since 1993.

  6. PDF Architecture of a Database System

    in database systems that may arise in the future. As a result, we focus on relational database systems throughout this paper. At heart, a typical RDBMS has five main components, as illustrated in Figure 1.1. As an introduction to each of these components and the way they fit together, we step through the life of a query in a database system.

  7. CS 764 Topics in Database Management Systems

    The topics discussed include query processing and optimization, advanced access methods, advanced concurrency control and recovery, parallel and distributed data systems, implications of cloud computing for data platforms, and data processing with emerging hardware. The course material will be drawn from a number of papers in the database ...

  8. CSE 5249

    Course description. This seminar focuses on recent research results in the intersection of data management and systems.

  9. Advances in Databases and Information Systems

    Needless to say, these four papers represent innovative and high quality research. The topics of these accepted papers are very timely and include: Big Data Applications and Principles, Evolving Business Intelligence Systems, Cultural Heritage Preservation and Enhancement and database evolution management.

  10. PDF Database Management Systems: A Case Study of Faculty of Open Education

    Database systems continue to be a key aspect of Computer Science & Engineering today. Representing knowledge within a computer is one of the central challenges of the field. Database research has focused primarily on this fundamental issue (6). This paper presents a database management system developed for AOF (Faculty of Open Education) course ...

  11. Relational data paradigms: What do we learn by taking the materiality

    Systems administrators and computer science textbooks may expect databases to be instantiated in a small number of technologies (e.g., relational or graph-based database management systems), but there are numerous examples of databases in non-conventional or unexpected technologies, such as spreadsheets or other assemblages of files linked ...

  12. 67 Data Management Essay Topics & Database Research Topics

    Interested in data management systems? 🌐 Check out this list of database research topics! Find here ideas on DBMS security and design other database topics for research papers.

  13. Advances in database systems education: Methods, tools, curricula, and

    Research papers written in the English language are included: EC: ... It is mainly because of its ability to handle data in a relational database management system and its direct implementation of database theoretical concepts. Other database topics such as transaction management, application programming, etc. are also among the main highlights of the ...

  14. (PDF) A Literature Review on Evolving Database

    It summarized the recent technologies that opened a new world and new research areas for databases. A long journey from RDBMS to NewSQL [16] gave rise to terms like Google Spanner and polyglot persistence ...

  15. Research Topics

    Data science is a field that crosscuts many research areas of computer science, such as artificial intelligence, machine learning, data mining, databases, and information systems. Our research falls into the last two of these areas and aims at supporting data science at the system level. Data science requires the management of new types of data ...

  16. CS 764 Topics in Database Management Systems

    The topics discussed include query processing and optimization, advanced access methods, advanced concurrency control and recovery, parallel and distributed data systems, cloud computing for data platforms, and data processing with emerging hardware. The course material will be drawn from a number of papers in the database literature.

  17. GitHub Pages

    Overview. Data management systems are the cornerstone of modern applications, businesses, and science (including data science). If you were excited by the topics in 4111, this graduate-level course in database systems research will be a deep dive into classic and modern database systems research.

  18. (PDF) Role of Database Management Systems (DBMS) in Supporting

    In the realm of Database Management Systems (DBMS), course curriculum covers several topics that range from data modeling to data implementation and examination.

  19. 40 List of DBMS Project Topics and Ideas

    Technology made it easier for people to accomplish daily tasks and activities. In the conventional method, customers avail themselves of services by visiting the shop that offers their desired services personally. 40 List of DBMS Project Topics and Ideas. Fish Catch System Database Design.

  20. Current Trends and New Challenges of Databases and Web Applications for

    Progress in systems-driven research [e.g., systems biology, physiome, systems physiology, systems pharmacology, virtual physiological human (VPH), personal health systems, life science e-infrastructures] is significantly driven by the development of suitable computational infrastructure, including tools and information resources.

  21. Research Topics in Database Management Systems

    Course description. This seminar focuses on recent research results in the intersection of data management and systems. There is no formal textbook for this course. We will mostly be reading and discussing recently published papers in venues such as SIGMOD, VLDB and ICDE. An important component of the course is an individual research project ...

  22. 78 Management Information System Topics for Presentation and Essays

    The scope of Management Information System is defined as, "The combination of human and computer based resources that results in the collection, storage, retrieval, communication and use of data for the purpose of efficient management […] Management Information Systems Types: Functions and Importance.

  23. Database Management System

    A database management system (DBMS) is software that helps you create and maintain databases. Today's commonest DBMS technology, termed relational database management systems (RDBMSs), was conceived by the computer scientist Edgar F. Codd, then at IBM, in a seminal, if hard-to-follow, 1970 paper [3]. Codd's penchant for mathematical ...
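Since several of the resources listed above center on the relational model and on what a DBMS actually does, here is a minimal, self-contained sketch in Python of creating, populating, and querying a small relational database. It uses only the standard-library sqlite3 module purely as an illustration; the table, column names, and data are invented for this example and do not come from any of the cited resources.

import sqlite3

# Throwaway in-memory database; a file path could be used instead.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# The DBMS enforces the schema we declare, a core idea of the relational model.
cur.execute("""
    CREATE TABLE course_topic (
        id    INTEGER PRIMARY KEY,
        area  TEXT NOT NULL,
        title TEXT NOT NULL
    )
""")
cur.executemany(
    "INSERT INTO course_topic (area, title) VALUES (?, ?)",
    [
        ("query processing", "Query optimization on modern hardware"),
        ("distributed systems", "Consistency in geo-replicated stores"),
    ],
)
conn.commit()

# Declarative retrieval: we state what we want, and the DBMS decides how to get it.
for row in cur.execute(
    "SELECT title FROM course_topic WHERE area = ? ORDER BY title",
    ("query processing",),
):
    print(row[0])

conn.close()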