|
Combining data using an Inner Join
Following is an example on how a misconception about the join principle can generate faulty data, and another example on how to generate the acceptable output.
EXAMPLE DATABASE: Training2DATA INPUTThe data input are city table and country table from the Training2.mdb database. The city table contains the country code, city name and city population.
QUERY join1SELECT * FROM city, country; The join1 query tries to combine the records of the country table and the city table, but the results returned are unacceptable.
DISCUSSIONThe goal of the previous query is to combine the rows of the city table to the corresponding rows of the country table. But the query generated faulty result. It selects everything from both tables and join the records in a way such that, for each record in the city table, it will merge all the records in the country table. The output produced is an example of what is called a Cartesian product. A misconception in the query is that all the records from both tables will be selected and the result will have the correct data expected from it. A factor that was neglected in the query is the specification on how those records must be combined. In the earlier definition of join, it is stated, that there should be matching values between the binding fields that are being compared upon. The condition on how data must be combined is always a requirement in making joins, otherwise, an erroneous result will be generated.
QUERY join2SELECT * FROM city, country WHERE city.cnt_code = country.cnt_code The join2 query tries to combine the corresponding records of the country table and the city table using the country code of both tables as its binding field. The binding field of both tables must have the same value for a join to take place. This JOIN query provides the expected combined records of the city and country tables.
DISCUSSIONThe goal of the query to correctly combine data from the country table and the city table is obtained because of the WHERE condition added in the syntax. The condition is that the city.cnt_code should be the same as country.cnt_code. This query uses an implied join statement to combine data. But there is another way on how data from those tables can be combined by explicitly using the INNER JOIN statement. This statement have the same output as the join2 query. SELECT * FROM city INNER JOIN country ON city.cnt_code=country.cnt_code; In the query result, it can be observed that the binding field (cnt_code) have the table names preceding it. It is bacause the field names are the same for the two tables and SQL would like to indicate from what table that particular field was derived. For other fields, the table names are not included because they are unique fields for each table.
Last modified September 23, 2008 6:06 am
|