Also known as DFD, Data flow diagrams are used to graphically represent the flow of data in a business information system. DFD describes the processes that are involved in a system to transfer data from the input to the file storage and reports generation.
Data flow diagrams can be divided into logical and physical. The logical data flow diagram describes flow of data through a system to perform certain functionality of a business. The physical data flow diagram describes the implementation of the logical data flow.
Need to create Data Flow Diagram? Visual Paradigm features a professional DFD editor that enables you to create DFDs faster, easier and better. It is an international award-winning modeler, and yet it is easy-to-use.
Learn MoreDFD graphically representing the functions, or processes, which capture, manipulate, store, and distribute data between a system and its environment and between components of a system. The visual representation makes it a good communication tool between User and System designer. Structure of DFD allows starting from a broad overview and expand it to a hierarchy of detailed diagrams. DFD has often been used due to the following reasons:
There are four basic symbols that are used to represent a data-flow diagram.
A process receives input data and produces output with a different content or form. Processes can be as simple as collecting input data and saving in the database, or it can be complex as producing a report containing monthly sales of all retail stores in the northwest region.
Every process has a name that identifies the function it performs.
The name consists of a verb, followed by a singular noun.
Example:
A data-flow is a path for data to move from one part of the information system to another. A data-flow may represent a single data element such the Customer ID or it can represent a set of data element (or a data structure).
Example:
Note that:
Because every process changes data from one form into another, at least one data-flow must enter and one data-flow must exit each process symbol.
One of the rule for developing DFD is that all flow must begin with and end at a processing step. This is quite logical, because data can't transform on its own with being process. By using the thumb rule, it is quite easily to identify the illegal data flows and correct them in a DFD.
Wrong | Right | Description |
---|---|---|
![]() |
![]() |
An entity cannot provide data to another entity without some processing occurred. |
![]() |
![]() |
Data cannot move directly from an entity to a data story without being processed. |
![]() |
![]() |
Data cannot move directly from a data store without being processed. |
![]() |
![]() |
Data cannot move directly from one data store to another without being processed. |
A second class of DFD mistakes arise when the outputs from one processing step do not match its inputs and they can be classified as:
A data store or data repository is used in a data-flow diagram to represent a situation when the system must retain data because one or more processes need to use the stored data in a later time.
Note that:
An external entity is a person, department, outside organization, or other information system that provides data to the system or receives outputs from the system. External entities are components outside of the boundaries of the information systems. They represent how the information system interacts with the outside world.
Note that:
Top-down decomposition, also called leveling, is a technique used to show more detail in lower-level DFDs. Leveling is done by drawing a series of increasingly detailed diagrams until the desired degree of detail is reached. As shown in the Figure, DFD Leveling is first displaying the targeted system as a single process, and then showing more detail until all processes are functional primitives.
When performing top-down decomposition to a DFD to lower level DFDs, the inputs and outputs must be conserved between levels of DFDs. For example, level n & n+1 must have the same inputs and outputs
A context diagram gives an overview and it is the highest level in a data flow diagram, containing only one process representing the entire system. It should be split into major processes which give greater detail and each major process may further split to give more detail.
Processes in diagram 0 (with a whole number) can be exploded further to represent details of the processing activities. Example below shows the next level ((Diagram 1) of process explosion.
Note that:
Although the following level 1 DFD only has three processes, there are quite a few input and input from the processes to the external entities and that could end up to be a few cross lines among them in the diagram; to avoid this problem, we could use (master and auxiliary view) multiple views of the same external entity in the DFD.
If a process with a lot of data flow linking between a few external entities, we could first extract that particular process and the associated external entities into a separate diagram similar to a context diagram, before you refine the process into a separate level of DFD; and by this way you can ensure the consistency between them much easier.
Data flow diagrams are categorized as either logical or physical. A logical data flow diagram focuses on the business and how the business operates. It is not concerned with how the system will be constructed. We can ignore implementation specifics such as, computer configuration, data storage technology, communication or message passing methods by focusing on the functions performed by the system, such as, data collection, data to information transformation and information reporting.
A physical data flow diagram shows how the system will be implemented, including the hardware, software, files, and people in the system. It is developed such that the processes described in the logical data flow diagrams are implemented correctly to achieve the goal of the business.
The example below shows a logical DFD and a physical DFD for a grocery store cashier:
The logical DFD illustrates the processes involved without going into detail about the physical implementation of activities.