
Requirement Analysis Using Deep Learning


Chaitanya Jariwala

Department of Computer Science


Nirma University

Ahmedabad, India


Abstract— Requirement analysis is a fundamental step in the software development process. The requirements stated by the customers are analysed, and an abstraction of them, called the requirement model, is produced. We present machine learning and deep learning approaches to make this process less burdensome. Using natural language processing, we can extract features from the requirements stated by the stakeholders. This can be done using recurrent neural networks.

Keywords—Software, Requirements, UML, Deep Learning, Neural Networks


Requirement analysis can be very tedious for software engineers, since the process involves a great deal of manual work. By applying various deep learning models, this effort can be reduced.


A. Natural Language Processing

When the client enters the requirements, there must be a mechanism to extract the information and understand the text using currently available techniques. At present, NLP is the main technique available to developers for analysing large amounts of text data. NLP is an area of research and application that explores how computers can be used to understand and manipulate natural language text or speech to do useful things.

1) Sentence splitting

Using this approach, the proposed system is expected to split all of the text into sentences after the client enters the requirements.

2) Lexical Analysis

The lexical analysis stage takes the split sentences and tokenizes them into words.

3) Syntax Analysis

This module takes the lexical tokens as input and applies a rule-based approach, producing as output the parts of speech found in the given text.

4) Word chunking

Using a chunking approach, the main goal of the proposed system is to extract the use cases from the input text. It identifies noun phrases (NP), verb phrases (VP), and prepositional phrases (PP) using the tokenized text and POS tags.
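The four NLP steps above can be sketched in Python. The paper names no specific toolkit, so the regex tokenizer, the tiny POS lexicon, and the NP-chunking rule below are illustrative stand-ins for a real pipeline (which would use a trained tagger and a full chunking grammar):

```python
import re

text = "The system shall store user data. The admin can delete accounts."

# 1) Sentence splitting (naive: split after sentence-final punctuation)
sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

# 2) Lexical analysis: tokenize each sentence into word tokens
tokens = [re.findall(r"[A-Za-z']+", s) for s in sentences]

# 3) Syntax analysis: assign part-of-speech tags from a toy lexicon
#    (a real system would use a trained rule-based or statistical tagger)
LEXICON = {"the": "DT", "shall": "MD", "can": "MD",
           "store": "VB", "delete": "VB",
           "system": "NN", "user": "NN", "data": "NN",
           "admin": "NN", "accounts": "NN"}
tagged = [[(w, LEXICON.get(w.lower(), "NN")) for w in sent] for sent in tokens]

# 4) Word chunking: group determiner+noun runs into noun phrases (NP)
def chunk_np(tag_seq):
    chunks, current = [], []
    for word, tag in tag_seq:
        if tag in ("DT", "NN"):
            current.append(word)
        else:
            if current:
                chunks.append(("NP", current))
                current = []
            chunks.append((tag, [word]))
    if current:
        chunks.append(("NP", current))
    return chunks

chunked = [chunk_np(t) for t in tagged]
# e.g. first sentence -> [("NP", ["The", "system"]), ("MD", ["shall"]), ...]
```

The same structure carries over directly to a toolkit such as NLTK, where `sent_tokenize`, `word_tokenize`, `pos_tag`, and `RegexpParser` cover the four stages.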

B. Deep Learning

Deep learning is part of a broader family of machine learning methods based on learning data representations, rather than task-specific algorithms. Learning can be supervised, semi-supervised, or unsupervised. Deep learning is a specialized form of machine learning that primarily uses neural networks. A type of neural network commonly used in natural language processing is the recurrent neural network.



We present an approach to automatically classify content elements of a natural language requirements specification as requirement or information. This approach can be used either to classify content elements in documents that have not been classified before, or to analyse already classified documents and support the author in identifying incorrectly classified content elements. Our approach uses convolutional neural networks, a machine learning technique that has recently gained attention in natural language processing.

To understand the different kinds of content contained in a requirements specification, we can study some requirement specification documents. Although these documents are written by different people, the kind of content and the structure of the documents are quite similar. The following are some observations concerning them.

• Most content elements contain natural language text. We observed that the phrasing tends to be more precise for requirements compared with additional information.

• We identified unique phrasing and structuring within the content elements of individual specification documents. This is due to the fact that these documents are created by different authors with slightly different views on how the documents should be written.

• In addition to natural language content elements, many content elements are created using structured and semi-formal notations, for example identifiers, tables, charts, conditions, logical expressions, and key-value pairs (e.g., “Maximum Voltage: 10 mV”).

• Certain content elements are always classified with the same label. For instance, elements that represent references to external documents are always classified as information, whereas voltage range specifications are always classified as requirement.

• In certain requirements specifications, the content type is not defined consistently. Comments, which clearly are not requirements, are sometimes not labelled as information. Occasionally a classification was missing entirely for a whole document.

A. Convolutional Neural Network for NLP

Classifying content elements of requirements specifications as either requirement or information is a two-class classification problem. Within the natural language processing community, many popular techniques exist to solve such a problem, including Naive Bayes and support vector machines. Although these techniques have limitations, such as ignoring word order, they have proven sufficient for classification tasks such as sentiment analysis or authorship attribution.

Convolutional neural networks (CNNs) are a variant of classic feed-forward neural networks, widely used in the image recognition community but recently gaining attention in natural language processing as well. These networks have several advantages compared with other classification techniques:

• Techniques such as Naive Bayes and support vector machines usually rely on the bag-of-words approach to convert natural language sentences into machine-readable feature vectors. Information about the order of words in the sentence is lost in the process. CNNs for natural language processing operate on a sentence representation that keeps word order intact. Thus, CNNs may learn and recognize patterns consisting of word sequences spanning multiple words in a sentence.


• To convert a natural language sentence into a machine-readable format, a word vectorization technique such as word2vec or GloVe is used. This enables the network to recognize patterns even if the words used in occurrences of the pattern differ slightly.
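The first advantage can be demonstrated in a few lines: under a bag-of-words representation, two sentences with the same words in a different order (and hence a different meaning) are indistinguishable. The example sentences are ours, chosen only to illustrate the point:

```python
from collections import Counter

# Two requirement-like sentences with opposite orderings of the same words:
s1 = "the user must confirm before the system deletes data"
s2 = "the system deletes data before the user must confirm"

# Bag-of-words: count word occurrences, discarding word order entirely
bow1 = Counter(s1.split())
bow2 = Counter(s2.split())

# Identical feature vectors -> a Naive Bayes or SVM classifier built on
# bag-of-words features cannot tell these sentences apart
assert bow1 == bow2
```

A CNN operating on the ordered sentence matrix, by contrast, sees two different inputs and can learn word-sequence patterns that distinguish them.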

The organization and functionality of CNNs as applied in this paper is illustrated in Fig. 1 and will be described briefly in the following section.

Fig. 1: Convolutional neural network architecture

The first step is to transform an input sentence into a vector representation. This is called word embedding; we use word2vec for this step. Word2vec maps a single word to a vector v ∈ Rn, where n is called the embedding size. One special property of word2vec is that the vector distance between two given words is small if the two words are used in similar contexts, whereas it is large if the words are not related at all. Sentences are transformed into a matrix m ∈ Rn×l, where l is the number of words in the sentence.
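The shapes involved in this embedding step can be sketched as follows. Real word2vec vectors are learned from a corpus; the random vectors below are only a stand-in to show how a sentence becomes an n×l matrix:

```python
import numpy as np

n = 8  # embedding size (a real model might use 100-300 dimensions)

# Stand-in for a trained word2vec model: each vocabulary word maps to a
# vector v in R^n. Here the vectors are random; in practice they are
# learned so that words used in similar contexts end up close together.
vocab = ["the", "system", "shall", "store", "user", "data"]
rng = np.random.default_rng(0)
embedding = {w: rng.normal(size=n) for w in vocab}

sentence = ["the", "system", "shall", "store", "user", "data"]
l = len(sentence)

# Sentence matrix m in R^(n x l): one column per word, in sentence order
m = np.column_stack([embedding[w] for w in sentence])
assert m.shape == (n, l)
```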

The first layer in the network applies a predefined set of filters to the sentence matrix m. Each filter is a matrix f ∈ Rn×o of trainable parameters, where n is the embedding size and o is the length of that particular filter. The number and sizes of the filters are hyperparameters and are therefore defined manually before training. In Fig. 1, two filters of length 3 and two filters of length 2 are illustrated. Filters are applied to a sentence matrix by moving them as a sliding window over it, producing a single value at each position using an activation function such as a rectifier or sigmoid function. This step is called convolution. Each filter learns to recognize a particular word pattern.

All values produced by a filter are then reduced to a single value by applying 1-max-pooling. The max-pooled values indicate whether the pattern learned by a filter is present within a sentence. All resulting values are concatenated and form a feature vector. This vector is connected to the output layer using a standard fully connected layer and a corresponding set of trainable parameters. The fully connected layer is used to associate certain patterns with an output class (e.g., the network may learn to associate the pattern “must be” with the class “requirement”). A softmax layer is finally used to produce a true probability distribution.
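A minimal forward pass through this architecture can be sketched with NumPy. All weights below are random (untrained), so this only shows the dataflow from sentence matrix to class probabilities, using the filter configuration of Fig. 1 (two filters of length 3, two of length 2):

```python
import numpy as np

rng = np.random.default_rng(1)
n, l = 8, 6                      # embedding size, sentence length
m = rng.normal(size=(n, l))      # sentence matrix from the embedding step

# Each filter f in R^(n x o) slides over the sentence matrix
filters = [rng.normal(size=(n, o)) for o in (3, 3, 2, 2)]

def convolve(m, f):
    """Slide filter f over m, one ReLU activation per window position."""
    o = f.shape[1]
    return np.array([np.maximum(0.0, np.sum(m[:, i:i + o] * f))
                     for i in range(m.shape[1] - o + 1)])

# 1-max-pooling: keep only the strongest response of each filter,
# indicating whether its learned pattern occurs anywhere in the sentence
features = np.array([convolve(m, f).max() for f in filters])

# Fully connected layer mapping the feature vector to the two classes
W = rng.normal(size=(2, len(filters)))
b = np.zeros(2)
logits = W @ features + b

# Softmax turns the logits into a probability distribution, e.g.
# probs[0] = P(requirement), probs[1] = P(information)
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

In practice the filters, W, and b are trained jointly by backpropagation; a framework such as Keras expresses the same structure with `Conv1D`, `GlobalMaxPooling1D`, and `Dense` layers.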

B. Approach

The procedure to develop a classifier that can distinguish information from requirements in natural language sentences consists of the following 9 stages:

• Understanding the application domain. We need to perform a lot of analysis to gain a better understanding of requirements and information.

• Creating a dataset. The next step is to create a dataset using the knowledge gained in the previous step.

• Pre-processing and cleaning the data. The data cannot be used in its raw form; it needs to be preprocessed before it can be used in the model.

• Data transformation. We must transform the data into a format appropriate for training a machine learning algorithm. We can choose, for example, the word2vec word embedding method.

• Choosing the appropriate data mining task. The problem of determining the type (requirement or information) of a given content element is a classification problem.


• Choosing the data mining algorithm. We chose CNNs for approaching our problem because of their recent success on many common natural language problems.

• Employing the data mining algorithm. The method of choosing hyperparameters and training the model is essential for a good model.

• Evaluation. A good start is to apply the model to a requirements specification from industry.

• Using the knowledge. A possible approach is to incorporate the trained model into a tool for quality assurance.
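The pre-processing and cleaning stage above can be sketched briefly. The raw content elements, their labels, and the cleaning rules below are illustrative assumptions (the paper does not specify its exact cleaning steps):

```python
import re

# Hypothetical labelled content elements, as produced in the
# dataset-creation stage
raw = [
    ("The system MUST respond within 2 seconds.", "requirement"),
    ("See document REF-042 for details.",         "information"),
    ("",                                          "information"),  # empty
]

def preprocess(text):
    """Lowercase, strip non-alphanumeric characters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()

# Drop empty elements and clean the rest, ready for the data-transformation
# (word embedding) stage
dataset = [(preprocess(t), label) for t, label in raw if t.strip()]
# -> [("the system must respond within 2 seconds", "requirement"), ...]
```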

C. Limitations of the approach

There are limitations to neural networks. Using them, we cannot achieve 100% accuracy, but with optimization of hyperparameters, around 80 to 90% accuracy can be achieved. Statements that are very common in the dataset are classified very accurately, but statements containing phrases that occur with low frequency in the dataset may be classified incorrectly.

The network does not give any insight into what it learns and why a particular output is produced. This problem is well known within the neural network community. Many approaches have been proposed to deal with it, both generic approaches, such as fuzzy rule extraction from neural networks, as well as domain-specific approaches (e.g., visualizing the weights of deep image recognition networks). A mechanism to trace decisions back through the network to identify relevant patterns in the input sentence would certainly be valuable to real users, especially when incorporating our approach into a tool.

Another constraint is that the applicability of our approach may be restricted to the documents of the industry partner whose documents we used to train the network. If we use the trained network to classify content elements of documents provided by a different industry partner, we may get inferior results.

Another issue that commonly arises with machine learning techniques is overfitting. Our network might be heavily biased towards specific and frequently recurring words and patterns in our training set and consequently might not be applicable to other documents. We still need to analyse whether this is a problem with our model.

D. Applications in Industries

To apply this methodology in industry, we can incorporate a pretrained CNN into a tool. This tool would help the requirements engineer in three different situations:

The requirements engineer may use the tool to identify misclassified content elements in a document in which content elements were already classified. The tool would analyse each content element and issue warnings if a misclassified item is detected.

The tool may also be used to create an initial classification for all content elements of a document in which content elements were not yet classified.

With a tool that supports these situations, authors can identify misclassified items faster and are encouraged to write higher-quality (i.e., easier to classify) content elements. Ideally, the tool would also give explanations of why a particular warning or comment is issued, by highlighting specific parts of a content element.
