1.2
Lot Table:
realestate=>
relname | reltuples | relpages
---------+-----------+----------
lot | 249999 | 2688
(1 row)
Customer Table:
realestate=> select relname, reltuples, relpages
realestate-> from pg_class
realestate-> where relname='customer';
relname | reltuples | relpages
----------+-----------+----------
customer | 249999 | 1441
(1 row)
Customer_lot Table:
realestate=> select relname, reltuples, relpages
realestate-> from pg_class
realestate-> where relname='customer_lot';
relname | reltuples | relpages
--------------+-----------+----------
customer_lot | 249999 | 1107
(1 row)
The blocking factor of the lot relation is 249999 / 2688 = 93.
The blocking factor of the customer relation is 249999/1441 = 173.
The blocking factor of the customer_lot relation is 249999/1107 = 225.
1.3
Selecting the lot_id for any lot in a given size range.Such a query would be select lot_id from lot where lot_size between 300 and 15000;
realestate=> explain analyze select lot_id from lot where lot_size between 300 and 15000;
NOTICE: QUERY PLAN:
"Seq Scan on lot (cost=0.00..6437.98 rows=146816 width=4) (actual time=0.125..470.049 rows=147220 loops=1)"
" Filter: ((lot_size >= 300::double precision) AND (lot_size <= 15000::double precision))"
"Total runtime: 785.181 ms"
EXPLAIN
Selecting the lot_id for any lot in a given value range. Such a query would be select lot_id from lot where lot_value between 3000 and 15000;
realestate=> explain analyze select lot_id from lot where lot_value between 3000 and 15000;
NOTICE: QUERY PLAN:
"Seq Scan on lot (cost=0.00..6437.98 rows=2996 width=4) (actual time=0.140..191.434 rows=3045 loops=1)"
" Filter: ((lot_value >= 3000::double precision) AND (lot_value <= 15000::double precision))"
"Total runtime: 199.254 ms"
EXPLAIN
realestate=> explain analyze select * from customer where customer_id=12;
NOTICE: QUERY PLAN:
"Seq Scan on customer (cost=0.00..4565.99 rows=1 width=16) (actual time=0.106..123.381 rows=1 loops=1)"
" Filter: (customer_id = 12)"
"Total runtime: 123.466 ms"
EXPLAIN
realestate=> explain analyze insert into customer values (250001, 'Vince', 'Smith' );
NOTICE: QUERY PLAN:
"Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.004..0.008 rows=1 loops=1)"
"Total runtime: 0.820 ms"
EXPLAIN
realestate=> explain analyze delete from customer where customer_id='250001';
NOTICE: QUERY PLAN:
"Seq Scan on customer (cost=0.00..4565.99 rows=1 width=6) (actual time=120.605..120.610 rows=1 loops=1)"
" Filter: (customer_id = 250001)"
"Total runtime: 120.728 ms"
EXPLAIN
realestate=> explain analyze update customer set customer_first_name='Vinny' where customer_id='249001';
NOTICE: QUERY PLAN:
"Seq Scan on customer (cost=0.00..4565.99 rows=1 width=16) (actual time=119.360..119.781 rows=1 loops=1)"
" Filter: (customer_id = 249001)"
"Total runtime: 120.039 ms"
EXPLAIN
realestate=> explain analyze select avg(lot_value) from lot;
NOTICE: QUERY PLAN:
"Aggregate (cost=5812.99..5813.00 rows=1 width=8) (actual time=1217.893..1217.896 rows=1 loops=1)"
" -> Seq Scan on lot (cost=0.00..5187.99 rows=249999 width=8) (actual time=0.076..618.552 rows=249999 loops=1)"
"Total runtime: 1218.034 ms"
EXPLAIN
1.4
For each query above, suggest an index, if applicable, to improve performance. To answer this question completely, you must state what columns you are indexing and what index you will use. You must fully defend your choice with a complete explanation. If an index is not appropriate for a query, clearly state why. Note that you may only use indexes supported in PostgreSQL, e.g. hash or b-tree. You may also use clustering.
Query types 1 and 2 both use a sequential scan for a particular range of values. These queries may be aided by a b-tree on the columns referenced in the where clause.
Query type 3 uses an equality selection condition based on customer id. In this case, the performance will be helped using a hash on the customer id.
Query 4 is an insert and should not be indexed, although, it is on a value that is also queried in query 2. In this case, the b-tree from query 2 will most likely cause performance to suffer.
Queries 5 and 6 both use sequential scans and will be aided by a hash on the customer id and the lot id.
Query 7 is an aggregate function that performs a sequential scan of the lot table.
These indexes may be built with the following sql statements:
create index customer_id_index on customer using hash(customer_id);
create index lot_id_index on lot using hash(lot_id);
create index lot_value_index on lot using btree(lot_value);
create index lot_size_index on lot using btree(lot_size);
The lot_value index may be clustered using cluster lot_value_index on lot. The lot_size index may be clustered using cluster lot_size_index on lot.
1.5
Implement the indexes that you proposed in the previous step and analyze the run-time performance of the same queries that you ran in step three. Submit a table showing the number of estimated and actual total cost expressed in the number of disk page accesses and query run times after the index was used. Also compute the percentage increase or decrease in performance (based on run time) that results from indexing the tables. If no performance benefit was gained from the index, you must explain why.
<1> Selecting the lot_id for all lots in a given size range. An example of such a query would be select lot_id from lot where lot_size between 300 and 15000;
realestate=> explain analyze
realestate-> select lot_id
realestate-> from lot
realestate-> where lot_size between 300 and 15000;
QUERY PLAN
--------------------------------------------------------------------------------
----------------------------
"Seq Scan on lot (cost=0.00..6437.98 rows=146816 width=4) (actual time=0.027..456.592 rows=147220 loops=1)"
"Total runtime: 772.519 ms"
<2> Selecting the lot_id for all lots in a given value range. An example of such a query would be select lot_id from lot where lot_value between 3000 and 15000;
realestate=> explain analyze
realestate-> select lot_id
realestate-> from lot
realestate-> where lot_value between 3000 and 15000;
QUERY PLAN
--------------------------------------------------------------------------------
------------------------
"Bitmap Heap Scan on lot (cost=66.99..2924.62 rows=2996 width=4) (actual time=3.065..16.461 rows=3045 loops=1)"
"Total runtime: 25.196 ms"
(3 行)
<3> Selecting all of the information for a specific customer. An example of such a query would be select * from customer where customer_id=12;
realestate=> explain analyze
realestate-> select *
realestate-> from customer
realestate-> where customer_id=12;
QUERY PLAN
--------------------------------------------------------------------------------
-----------------------------------------------
"Index Scan using customer_id_index on customer (cost=0.00..8.31 rows=1 width=16) (actual time=0.042..0.047 rows=1 loops=1)"
"Total runtime: 0.127 ms"
(3 行)
<4> Inserting new customer or lot data. An example of such a query would be insert into customer values (250001, 'Vince', 'Smith' );
realestate=> explain analyze
realestate-> insert into customer
realestate-> values(250001,'Vince','Smith');
QUERY PLAN
--------------------------------------------------------------------------------
----
"Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.005..0.008 rows=1 loops=1)"
"Total runtime: 0.197 ms"
(2 �
- 1
- 2
- 3
前往页