University of Siegen, Siegen, Germany
+49 (0) 271 - 740 3327
contact@bridgeus.de

Interviews

Big Data and Artificial Intelligence Made in Germany inspiring the US

Prof. Maria-Esther Vidal

Head of the Scientific Data Management Research Group at TIB and member of the L3S Research Centre at the University of Hannover

Prof. Maria-Esther Vidal is the head of the Scientific Data Management Research Group at TIB and a member of the L3S Research Centre at the University of Hannover; she is also a full professor (on leave) at Universidad Simón Bolívar (USB), Venezuela. Her interests include Big data and knowledge management, knowledge representation, and the semantic web. She has published more than 180 peer-reviewed papers in Semantic Web, Databases, Bioinformatics, and Artificial Intelligence. She has co-authored one monograph and co-edited books and journal special issues. She serves on various editorial boards (e.g., JWS, JDIQ) and has been general chair, co-chair, senior member, and reviewer of several scientific events and journals (e.g., ESWC, AAAI, AMW, WWW, KDE). She is leading data management tasks in the EU H2020 projects iASiS, BigMedilytics, CLARiFY, PLATOON, and QualiChain, and has participated in BigDataEurope and BigDataOcean; she is a supervisor in the MSCA-ETN projects WDAqua and NoBIAS. She has been a visiting professor at different universities (e.g., Uni Maryland, UPM Madrid, UPC, KIT Karlsruhe, Uni Nantes). In the past, she has participated in international projects (e.g., FP7, NSF, AECI) and led industrial data integration projects for more than 10 years (e.g., BellSouth, Telefonica).

maria.vidal@tib.eu / Twitter: @MEVidalSerodio

 

  1. Could you provide examples of Big data?
  • Data from domains like Biomedicine, Maritime, Scholarly Communication, and Energy
  2. Which of the dominant dimensions of Big data, e.g., volume, veracity, velocity, characterizes this data?
  • Veracity, variety, and volume are the most dominant dimensions of Big data

 

  3. What are the main challenges for Big data management?
  • Data heterogeneity and curation
  4. What is the main focus of your work: Big data management or analytics?
  • My main focus is on semantic data integration, knowledge graph creation, and analytics
  5. Which tools have your team developed to address the challenges and fulfil the Big data requirements?
  • We have developed an ecosystem of tools for creating and managing knowledge graphs (see the sketch after this list)
  6. What are the main contributions of your in-house tools to the area of Big data?
  • Data transparency and management traceability
  7. Which problems still remain open?
  • Tracing the whole pipeline that transforms Big data into actionable knowledge
  8. How do you compare yourself with your competitors in business or research in the US?
  • At the research level, we are all addressing the challenge of managing Big data responsibly, in a way that ensures compliance with legal and ethical regulations.
  9. Is there any other point/comment/question that you would like to add?
  • Managing Big data provides us with great power, but it also comes with great responsibility. Thus, techniques are needed to ensure that data is processed according to ethical guidelines and national and international legal regulations.
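
To make the idea of knowledge graph creation concrete, here is a purely illustrative Python sketch (not one of Prof. Vidal's group's actual tools) using the open-source rdflib library: two facts from heterogeneous sources are integrated under one shared vocabulary and then queried with SPARQL. The ex: namespace and the drug example are hypothetical.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/")  # hypothetical shared vocabulary

g = Graph()
g.bind("ex", EX)

# Integrate records from heterogeneous sources as triples under one vocabulary
g.add((EX.aspirin, RDF.type, EX.Drug))
g.add((EX.aspirin, RDFS.label, Literal("Aspirin")))
g.add((EX.aspirin, EX.interactsWith, EX.warfarin))

# Query the integrated knowledge graph with SPARQL
for row in g.query("SELECT ?d WHERE { ?d a ex:Drug }", initNs={"ex": EX}):
    print(row.d)
```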

PART II: 

Consider now the general context:

  1. What are the grand challenges that the area of Big data demands to face?
  • Heterogeneity and data quality are two dimensions of Big data that considerably affect the effectiveness of a data-driven solution. In addition to solving these problems effectively, it is also necessary that all decisions are traceable, so that the satisfaction of integrity constraints, data protection regulations, and ethical principles is verifiable.
  2. What is the impact of efficiently managing Big data on AI?
  • Efficiency is a crucial aspect. However, I would consider that ensuring fairness and high quality are the most crucial and challenging aspects when Big data is used in AI, so that the results are trustworthy.
  3. Which are the application areas that can benefit most from scaling up to Big data?
  • Many areas can benefit; a few are Biomedicine, Scholarly Communication, Energy, and Maritime.

 

Prof. Dr. Mahdi Bohlouli

CEO for Research and Development at Petanux GmbH, based in Bonn, Germany

Prof. Dr. Mahdi Bohlouli is currently CEO of a research and development SME based in Bonn, Germany, and Assistant Professor for data science and machine learning at the Institute for Advanced Studies in Basic Sciences (IASBS) in Zanjan, with a main focus on applied deep learning and big data, especially tackling misinformation (fake news), identifying social bots, and applying generative adversarial networks to misinformation processing. Mahdi has served as a Program Committee (PC) member as well as co-chair of workshops at numerous highly regarded conferences such as ISC, ADBIS, and FiCloud. He is also general co-chair and a member of the steering committee of CiDaS 2019 (The International Conference on Contemporary Issues in Data Science) and guest editor of the Topical Collection "Data Science, Big Data and Applied Deep Learning: From Science to Applications" in the Springer Nature Applied Sciences journal. Mahdi has been invited as a reviewer for leading scholarly journals such as Information Processing & Management (Elsevier), Information Processing Letters (Elsevier), and Pattern Recognition Letters (Elsevier).

 

mb@petanux.com // Twitter: @m_bohlouli

  1. Could you provide examples of Big data?

Search Engine Data, Urban Planning and Traffic Management Data, Wiki and Web Data, Social Network Data

  2. Which of the dominant dimensions of Big data, e.g., volume, veracity, velocity, characterizes this data?

It depends on the type of each dataset. Each dataset can have one or more of the V dimensions. For example, Wiki and Web data have volume and variety. On the other hand, Traffic Management data has volume and velocity characteristics.

  3. What are the main challenges for Big data management?

How to create value from existing big data. Being able only to store and collect the data is not enough.

  4. What is the main focus of your work: Big data management or analytics?

Analytics: scalable AI algorithms from the Big Data perspective.

  5. Which tools have your team developed to address the challenges and fulfil the Big data requirements?

Spark-based tools and scalable AI algorithms.
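
The interview does not detail Petanux's in-house tools, but as a minimal sketch of what a Spark-based scalable learning step can look like, the following PySpark snippet trains a logistic regression model on a distributed DataFrame. The file name and the column names (clicks, duration, label) are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("scalable-ml-sketch").getOrCreate()

# Load a (potentially very large) dataset, partitioned across the cluster
df = spark.read.csv("events.csv", header=True, inferSchema=True)  # hypothetical file

# Assemble the hypothetical feature columns into a single vector column
assembler = VectorAssembler(inputCols=["clicks", "duration"], outputCol="features")
train = assembler.transform(df).select("features", "label")

# Fit the model; Spark distributes the optimization across the cluster
model = LogisticRegression(maxIter=20).fit(train)
print(model.coefficients)
```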

  6. What are the main contributions of your in-house tools to the area of Big data?

Efficient and faster knowledge discovery from big data.

  7. Which problems still remain open?

Concurrent support of various data types (variety).

  8. How do you compare yourself with your competitors in business or research in the US?

Our algorithms and methods outperform state-of-the-art methods, especially those developed at first-class US universities. We still lack exploitation of our results in business and industry, which is in progress and already planned.

  9. Is there any other point/comment/question that you would like to add?

No

PART II: Consider now the general context:

  1. What are the grand challenges that the area of Big data demands to face?

Data Privacy and Ethics

  2. What is the impact of efficiently managing Big data on AI?

As already mentioned, the best and most important thing in Big Data management is being able to create value from the big data, and this is possible only by means of AI.

 

 

Dr. Johannes Winter

Managing Director at Lernende Systeme – Germany’s Artificial Intelligence Platform & Head of Technology Department at acatech – National Academy of Science and Engineering

Johannes Winter is Managing Director of "Lernende Systeme – Germany's Artificial Intelligence Platform" and Head of Technology at acatech – National Academy of Science and Engineering. Previously, he was Head of Economic & Science Relations and Personal Assistant to Professor Henning Kagermann, the father of Industrie 4.0. Johannes Winter holds a PhD in regional economics from the University of Cologne and has served as a lecturer at the University of Applied Sciences and Economics in Munich. His prior work experience includes positions in the automotive industry, consulting, and academic research.

Contacts

E-Mail: winter@acatech.de

1. How is digital transformation changing the economy?

Throughout history, industrial production has been transformed several times: first by steam power, then by electricity, and nearly 50 years ago by automation. Over the last decade, significant progress has been made in fields such as microelectronics, robotics, photonics, machine learning, cloud computing, and real-time analytics, which has led to the next industrial revolution. Some years ago, my colleagues at acatech – National Academy of Science and Engineering looked for solutions to secure Germany's competitiveness and, in this context, published the concept known as Industrie 4.0. What does it mean? In a nutshell, the fourth industrial revolution, or Industrie 4.0, can be characterized by three attributes: smart, hyperconnected, and autonomous. The Internet of Things is entering the factory, connecting smart products, smart machines, and workers equipped with smart devices.

2. Why is big data important?

In this context, data are becoming independent economic goods: they have a value and are the basis of innovative and profitable business models. Once they have left the factory, smart products remain connected via the internet and exchange massive volumes of data during their use. These big data are refined into smart data, which can then be used to control, maintain, enhance, and improve smart products and services. They generate the knowledge that forms the basis of new business models.
The consolidation and refinement of data via real-time analytics and artificial intelligence usually takes place on data-rich digital platforms, which will soon be the predominant marketplace.

3. Which are the most significant challenges and opportunities related to big data technologies in your opinion?

Quite a few companies have already connected smart products to the internet and have started collecting and evaluating data. Ideally, those platforms should combine device management with easy connectivity, data storage systems, and an app store open for customized data-driven services provided by an open digital ecosystem. The quality of the digital innovation ecosystem, and how fast it can be established, will be crucial for the successful implementation of new digital business models. In addition, several challenges must be addressed regarding financing, reliability, data security, IPR protection, and, finally, standardization.

4. How do big data technologies impact the Future of Work and industry 4.0?

The digital transformation will enable companies to react faster and more precisely to changing customer needs and new market conditions. It is already well understood that fast implementation of data-based business models and a high level of flexibility, adaptability, and willingness to change among organizations and their employees are crucial for success in the face of global competition. Key factors in the successful introduction of big data include the acceptance of new technologies by employees and the design of attractive forms of work. At the same time, the higher degree of flexibility, in turn, opens the opportunity for workers to achieve a better work-life balance and to safeguard their long-term employability through personalized re-skilling and up-skilling measures. In this context, the ability of workers to learn (and retrain) throughout the span of their careers is key to ensuring their future employability (lifelong learning). Companies share this responsibility by providing the corresponding education and training, and their employees obviously benefit from these measures.

5. In your opinion how can big data be combined with AI and what can be the resulting benefits?

Data are acquiring a monetary value – which is what inspires some to speak of the data economy or data capitalism. The required data are merged, analyzed, and interpreted on digital, usually cloud-based technology platforms, with the help of AI and machine-learning methods and tools. Autonomous software systems such as self-learning robo-advisers or assistance systems contribute to a personalized and convenient user experience. Reconfiguration is no longer a manual process but autonomous and dynamic. This provides us with highly adaptable processes on all organizational levels for the first time: from the factory floor to the business level, which is often referred to as a new wave of business process reengineering. As a result, the collection and use of data will become omnipresent. Self-learning and autonomous systems driven by artificial intelligence use these data to make independent decisions, also building on their own learning processes. These developments represent a challenge, but above all an opportunity, for Germany and the US. The guiding principle here should be that digitalization is shaped primarily by people, for people. In order to design self-learning systems according to the needs of humans and society, the German Federal Ministry of Education and Research (BMBF) and acatech launched Germany's AI Platform "Lernende Systeme" in 2017. The Platform brings together leading expertise from science, industry, and society and consolidates the current state of knowledge about big data and artificial intelligence. Its members point out developments in industry and society, analyze the skills that will be needed in the future, and use real application scenarios to demonstrate the benefits of data-based systems.

6. Would you be interested in collaborating with USA institutions (companies and academia) for projects in the domain of big data?

Yes, of course. Innovation needs stronger collaboration: between scientific disciplines, across all societal groups, and at the international level, in particular when it comes to common regulation, industrial standards, and markets. Just think of new data-driven business models. They will not be created by single companies but in digital ecosystems across different industries and countries, with large multinationals as well as SMEs and start-ups. In any case, international networking and open co-innovation are the critical success factors for the innovation system of the future.

 


Prof. Dr. Jens Lehmann

"Smart Data Analytics" Research Leader and Full Professor at the University of Bonn Lead Scientist of Enterprise Information Systems Department at Fraunhofer IAIS Leader of SDA Competence Center of Institute for Applied Informatics at the University of Leipzig

Prof. Dr. Jens Lehmann leads the "Smart Data Analytics" research group at the University of Bonn, with 40 researchers. He is a lead scientist in the Enterprise Information Systems department at Fraunhofer IAIS and head of the Dresden branch of Fraunhofer IAIS. His research interests include semantic web, machine learning, question answering & dialogue systems, distributed computing, and knowledge representation. Prof. Lehmann has authored more than 120 articles in international journals and conferences, 50 of them in A-level venues, winning 12 best paper awards. His articles have been cited more than 16,000 times. He holds leading positions in several major conferences and journals. He is a founder of, leader of, or contributor to several community research projects, including SANSA, AskNow, DL-Learner, DBpedia, and LinkedGeoData. Previously, he completed his PhD "summa cum laude" at the University of Leipzig, with visits to the University of Oxford.

Contacts

E-Mail: jens.lehmann@cs.uni-bonn.de

 

1. How are big data technologies developed and used in your institution?

 

We are making use of established frameworks for big data and machine learning, e.g. Apache Spark, Apache Flink, or PyTorch. The tools developed in our groups focus on analysing and using knowledge graphs. To give some examples: SANSA is our framework for scalable knowledge graph analytics, AskNow is developed as an umbrella project for our conversational AI efforts over knowledge graphs, and PyKEEN is a framework for knowledge graph embeddings. We are strong supporters of reproducible research. Therefore, most of the code for the above research is open source, and the data/results are made public whenever possible. These tools are being used in multiple use cases (both academic and industrial) from multidisciplinary areas.
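
For illustration, PyKEEN's pipeline API can train a knowledge graph embedding model in a few lines. This is a minimal sketch using a toy dataset shipped with the library; the output directory name is hypothetical.

```python
from pykeen.pipeline import pipeline

# Train a TransE embedding model on the small built-in "Nations" benchmark
result = pipeline(
    dataset="Nations",
    model="TransE",
    training_kwargs=dict(num_epochs=50),
)

# Persist the trained model and evaluation metrics
result.save_to_directory("nations_transe")  # hypothetical directory
```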

 

2. Give us an overview of your (past and current) projects related to big data technologies (incl. website links).

 

Main community project:

     SANSA (Github)

 

Funded projects:

     SLIPO

     Better

     BOOST

     PLATOON

     Big Data Europe

     Big Data Ocean

     SPEAKER

 

a. Which are the application areas of your big data tools? (e.g. health, energy, manufacturing etc.)

 

We mainly develop generic tools to handle a variety of application areas.

 

In our project “Big Data Europe”, we developed a scalable platform for Big Data workflows. Seven applications coming from important societal challenges (specifically Health, Security, Food and Agriculture, Society, Transport, Energy, and Climate) were developed and executed over the platform.

The project “Big Data Ocean” dealt with analytics for the heterogeneous data coming from Maritime domains with applications like anomaly detection in ship routes and proactive maintenance.

In SLIPO, we are using SANSA to cluster points of interest based on geolocation data and for cascaded clustering.

 

Within the SPEAKER project, we will develop a major business-to-business conversational AI platform. SPEAKER will also have a data platform that can draw on data management tools in order to include large-scale heterogeneous data sources in dialogues.

 

 

b. Have you deployed your Big Data solutions (e.g. in pilots, as MVP, for EU or German projects, directly in a real-life (industrial) context)?

 

The platform and scalable analytics developed in “Big Data Ocean” have received substantial interest from the Maritime industry and are currently being extended.

 

“Big Data Europe” is being used by several community projects and SMEs.

 

SANSA has gained traction in the community. Some details are described in the original framework paper. Recently, it has been used for scalable querying and quality assessment of data sets. Alethio has used it to analyse the Ethereum blockchain.

 

PyKEEN is being tested by several projects and companies; for example, Bayer Crop Science has shown keen interest in using the tool in their research.

 

c. What is the TRL level?

 

The TRL differs between implementations / use cases and is typically between 3 and 7.

 

d. Have you developed specific big data tools? Please elaborate (info re. tool, programming language(s), scalability, links etc.)

 

SANSA: the Scalable Semantic Analytics Stack is developed for distributed analytics using Apache Spark and Flink, in Scala. SANSA provides several layers for knowledge graph representation, including basic operations like statistics or quality assurance, as well as querying, inference, and analytics.
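
SANSA itself is implemented in Scala; as a language-neutral illustration of the distributed-statistics idea behind its layers, here is a minimal PySpark sketch that counts predicate frequencies in an N-Triples file. The file path is hypothetical, and the whitespace-based parsing is deliberately simplistic.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdf-stats-sketch").getOrCreate()

# Each line of an N-Triples file looks like: <s> <p> <o> .
triples = (
    spark.sparkContext.textFile("data.nt")  # hypothetical path
    .filter(lambda line: line.strip() and not line.startswith("#"))
    .map(lambda line: line.split(" ", 2))   # naive split into s, p, rest
)

# Distributed aggregation: predicate usage statistics across the cluster
predicate_counts = triples.map(lambda t: (t[1], 1)).reduceByKey(lambda a, b: a + b)

for predicate, count in predicate_counts.takeOrdered(10, key=lambda kv: -kv[1]):
    print(predicate, count)
```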

 

3. Why is big data important? (e.g. for business / economy / academia / society)

 

From our perspective, the main value of Big Data lies in the ability to perform scalable analytics, in particular over commodity hardware. It can be useful whenever the memory of a standard server is insufficient to carry out a particular task.

 

4. What is your advice for aspiring big data experts? (e.g., w.r.t. training / studies, job search, which programming languages to learn, implementation of big data projects etc.)

 

A full answer would be very long. We recommend investing in learning the underlying principles through proper education and training. (We provide trainings at Fraunhofer, which can be useful.)

 

5. In your opinion how can big data be combined with AI and what can be the resulting benefits?

 

Both terms do not have a clearly defined meaning and are understood differently among members of the community.

 

Big Data in a narrow sense is about distributed computing for large-scale data sources. The main innovation of Big Data approaches is their ability to scale computing over many (often commodity) machines.

 

AI can be seen as a subfield of computer science that encompasses approaches perceived as intelligent by humans – at least this is my favorite definition. It shows that the perception of AI is subjective and changes over time. Route planning is an example of an area that is moving from being an AI topic to a mainstream technology no longer commonly perceived as intelligent.

 

With this in mind, the combination is clear: whenever intelligent tasks need to be performed at scale, the two areas of Big Data and AI intersect and benefit from each other. This ranges from search and conversational AI to prediction and clustering tasks over large-scale data sources. The SANSA framework and other libraries built on top of computing frameworks like Spark or Flink are examples of this intersection – they all aim to provide intelligent analytics at scale.
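
As one concrete instance of "intelligent tasks at scale", a clustering job can be expressed directly against a distributed dataset. The sketch below uses Spark MLlib's KMeans; the input path and the column names (x, y) are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.ml.clustering import KMeans
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("clustering-at-scale").getOrCreate()

points = spark.read.parquet("points.parquet")  # hypothetical large dataset
vec = VectorAssembler(inputCols=["x", "y"], outputCol="features")  # hypothetical columns
data = vec.transform(points)

# KMeans runs as a distributed job over all partitions of the data
model = KMeans(k=8, seed=42).fit(data)
model.transform(data).select("x", "y", "prediction").show(5)
```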

 

6. Would you be interested in collaborating with USA institutions (companies and academia) for projects in the domain of big data? (If yes, please indicate how contact can be made, e.g. via email, LinkedIn, Twitter, etc.)

 

Generally, I am interested in performing Big Data and Conversational AI projects with US companies. Contact should ideally be made via email (see http://jens-lehmann.org/ and https://www.iais.fraunhofer.de/de/institut/mitarbeiterprofile/jens-lehmann.html for details).

 

 

