Add an explicit representation of the output targetlist to Paths.

Up to now, there's been an assumption that all Paths for a given relation
compute the same output column set (targetlist).  However, there are good
reasons to remove that assumption.  For example, an indexscan on an
expression index might be able to return the value of an expensive function
"for free".  While we have the ability to generate such a plan today in
simple cases, we don't have a way to model that it's cheaper than a plan
that computes the function from scratch, nor a way to create such a plan
in join cases (where the function computation would normally happen at
the topmost join node).  Also, we need this so that we can have Paths
representing post-scan/join steps, where the targetlist may well change
from one step to the next.  Therefore, invent a "struct PathTarget"
representing the columns we expect a plan step to emit.  It's convenient
to include the output tuple width and tlist evaluation cost in this struct,
and additional fields will likely be added in the future.
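
Concretely, the struct added here (see the header hunk near the end of the
diff) is just the expression list plus the two costing fields mentioned
above:

    typedef struct PathTarget
    {
        List       *exprs;      /* list of expressions to be computed */
        QualCost    cost;       /* cost of evaluating the above */
        int         width;      /* estimated avg width of result tuples */
    } PathTarget;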

While Path nodes that actually do have custom outputs will need their own
PathTargets, it will still be true that most Paths for a given relation
will compute the same tlist.  To reduce the overhead added by this patch,
keep a "default PathTarget" in RelOptInfo, and allow Paths that compute
that column set to just point to their parent RelOptInfo's reltarget.
(In the patch as committed, actually every Path is like that, since we
do not yet have any cases of custom PathTargets.)

I took this opportunity to provide some more-honest costing of
PlaceHolderVar evaluation.  Up to now, the assumption that "scan/join
reltargetlists have cost zero" was applied not only to Vars, where it's
reasonable, but also to PlaceHolderVars, where it isn't.  Now, we add the eval
cost of a PlaceHolderVar's expression to the first plan level where it can
be computed, by including it in the PathTarget cost field and adding that
to the cost estimates for Paths.  This isn't perfect yet but it's much
better than before, and there is a way forward to improve it more.  This
costing change affects the join order chosen for a couple of the regression
tests, changing expected row ordering.
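
The mechanics are visible in the costing hunks: set_rel_width() (which now
also fills reltarget.cost, as its updated comment admits) charges
cost_qual_eval_node() of each PlaceHolderVar's contained expression into the
rel's default PathTarget, and each scan/join costing function then adds the
target cost on top of its qual costs, paid per output row rather than per
tuple scanned:

    /* tlist eval costs are paid per output row, not per tuple scanned */
    startup_cost += path->pathtarget->cost.startup;
    run_cost += path->pathtarget->cost.per_tuple * path->rows;

To avoid double-counting, cost_qual_eval_walker() now treats a
PlaceHolderVar itself as having cost zero, since the contained expression
was already charged at the first plan level able to compute it.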
Tom Lane 2016-02-18 20:01:49 -05:00
parent 3386f34cdc
commit 19a541143a
18 changed files with 337 additions and 154 deletions


@ -806,7 +806,7 @@ check_selective_binary_conversion(RelOptInfo *baserel,
}
/* Collect all the attributes needed for joins or final output. */
pull_varattnos((Node *) baserel->reltargetlist, baserel->relid,
pull_varattnos((Node *) baserel->reltarget.exprs, baserel->relid,
&attrs_used);
/* Add all the attributes used by restriction clauses. */
@ -938,7 +938,7 @@ estimate_size(PlannerInfo *root, RelOptInfo *baserel,
*/
int tuple_width;
tuple_width = MAXALIGN(baserel->width) +
tuple_width = MAXALIGN(baserel->reltarget.width) +
MAXALIGN(SizeofHeapTupleHeader);
ntuples = clamp_row_est((double) stat_buf.st_size /
(double) tuple_width);


@ -728,10 +728,10 @@ build_tlist_to_deparse(RelOptInfo *foreignrel)
PgFdwRelationInfo *fpinfo = (PgFdwRelationInfo *) foreignrel->fdw_private;
/*
* We require columns specified in foreignrel->reltargetlist and those
* We require columns specified in foreignrel->reltarget.exprs and those
* required for evaluating the local conditions.
*/
tlist = add_to_flat_tlist(tlist, foreignrel->reltargetlist);
tlist = add_to_flat_tlist(tlist, foreignrel->reltarget.exprs);
tlist = add_to_flat_tlist(tlist,
pull_var_clause((Node *) fpinfo->local_conds,
PVC_REJECT_AGGREGATES,


@ -479,7 +479,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
* columns used in them. Doesn't seem worth detecting that case though.)
*/
fpinfo->attrs_used = NULL;
pull_varattnos((Node *) baserel->reltargetlist, baserel->relid,
pull_varattnos((Node *) baserel->reltarget.exprs, baserel->relid,
&fpinfo->attrs_used);
foreach(lc, fpinfo->local_conds)
{
@ -522,7 +522,7 @@ postgresGetForeignRelSize(PlannerInfo *root,
/* Report estimated baserel size to planner. */
baserel->rows = fpinfo->rows;
baserel->width = fpinfo->width;
baserel->reltarget.width = fpinfo->width;
}
else
{
@ -539,7 +539,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
{
baserel->pages = 10;
baserel->tuples =
(10 * BLCKSZ) / (baserel->width + MAXALIGN(SizeofHeapTupleHeader));
(10 * BLCKSZ) / (baserel->reltarget.width +
MAXALIGN(SizeofHeapTupleHeader));
}
/* Estimate baserel size as best we can with local statistics. */
@ -2176,7 +2177,7 @@ estimate_path_cost_size(PlannerInfo *root,
* between foreign relations.
*/
rows = foreignrel->rows;
width = foreignrel->width;
width = foreignrel->reltarget.width;
/* Back into an estimate of the number of retrieved rows. */
retrieved_rows = clamp_row_est(rows / fpinfo->local_conds_sel);
@ -3646,7 +3647,7 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
&width, &startup_cost, &total_cost);
/* Now update this information in the joinrel */
joinrel->rows = rows;
joinrel->width = width;
joinrel->reltarget.width = width;
fpinfo->rows = rows;
fpinfo->width = width;
fpinfo->startup_cost = startup_cost;


@ -1159,7 +1159,7 @@ GetForeignServerByName(const char *name, bool missing_ok);
it contains restriction quals (<literal>WHERE</> clauses) that should be
used to filter the rows to be fetched. (The FDW itself is not required
to enforce these quals, as the core executor can check them instead.)
<literal>baserel-&gt;reltargetlist</> can be used to determine which
<literal>baserel-&gt;reltarget.exprs</> can be used to determine which
columns need to be fetched; but note that it only lists columns that
have to be emitted by the <structname>ForeignScan</> plan node, not
columns that are used in qual evaluation but not output by the query.


@ -1589,6 +1589,7 @@ _outOnConflictExpr(StringInfo str, const OnConflictExpr *node)
*
* Note we do NOT print the parent, else we'd be in infinite recursion.
* We can print the parent's relids for identification purposes, though.
* We print the pathtarget only if it's not the default one for the rel.
* We also do not print the whole of param_info, since it's printed by
* _outRelOptInfo; it's sufficient and less cluttering to print just the
* required outer relids.
@ -1598,10 +1599,14 @@ _outPathInfo(StringInfo str, const Path *node)
{
WRITE_ENUM_FIELD(pathtype, NodeTag);
appendStringInfoString(str, " :parent_relids ");
if (node->parent)
_outBitmapset(str, node->parent->relids);
else
_outBitmapset(str, NULL);
_outBitmapset(str, node->parent->relids);
if (node->pathtarget != &(node->parent->reltarget))
{
WRITE_NODE_FIELD(pathtarget->exprs);
WRITE_FLOAT_FIELD(pathtarget->cost.startup, "%.2f");
WRITE_FLOAT_FIELD(pathtarget->cost.per_tuple, "%.2f");
WRITE_INT_FIELD(pathtarget->width);
}
appendStringInfoString(str, " :required_outer ");
if (node->param_info)
_outBitmapset(str, node->param_info->ppi_req_outer);
@ -1901,11 +1906,13 @@ _outRelOptInfo(StringInfo str, const RelOptInfo *node)
WRITE_ENUM_FIELD(reloptkind, RelOptKind);
WRITE_BITMAPSET_FIELD(relids);
WRITE_FLOAT_FIELD(rows, "%.0f");
WRITE_INT_FIELD(width);
WRITE_BOOL_FIELD(consider_startup);
WRITE_BOOL_FIELD(consider_param_startup);
WRITE_BOOL_FIELD(consider_parallel);
WRITE_NODE_FIELD(reltargetlist);
WRITE_NODE_FIELD(reltarget.exprs);
WRITE_FLOAT_FIELD(reltarget.cost.startup, "%.2f");
WRITE_FLOAT_FIELD(reltarget.cost.per_tuple, "%.2f");
WRITE_INT_FIELD(reltarget.width);
WRITE_NODE_FIELD(pathlist);
WRITE_NODE_FIELD(ppilist);
WRITE_NODE_FIELD(partial_pathlist);


@ -919,19 +919,20 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
/*
* CE failed, so finish copying/modifying targetlist and join quals.
*
* Note: the resulting childrel->reltargetlist may contain arbitrary
* expressions, which otherwise would not occur in a reltargetlist.
* Note: the resulting childrel->reltarget.exprs may contain arbitrary
* expressions, which otherwise would not occur in a rel's targetlist.
* Code that might be looking at an appendrel child must cope with
* such. (Normally, a reltargetlist would only include Vars and
* PlaceHolderVars.)
* such. (Normally, a rel's targetlist would only include Vars and
* PlaceHolderVars.) XXX we do not bother to update the cost or width
* fields of childrel->reltarget; not clear if that would be useful.
*/
childrel->joininfo = (List *)
adjust_appendrel_attrs(root,
(Node *) rel->joininfo,
appinfo);
childrel->reltargetlist = (List *)
childrel->reltarget.exprs = (List *)
adjust_appendrel_attrs(root,
(Node *) rel->reltargetlist,
(Node *) rel->reltarget.exprs,
appinfo);
/*
@ -976,7 +977,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
Assert(childrel->rows > 0);
parent_rows += childrel->rows;
parent_size += childrel->width * childrel->rows;
parent_size += childrel->reltarget.width * childrel->rows;
/*
* Accumulate per-column estimates too. We need not do anything for
@ -984,10 +985,10 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
* Var, or we didn't record a width estimate for it, we have to fall
* back on a datatype-based estimate.
*
* By construction, child's reltargetlist is 1-to-1 with parent's.
* By construction, child's targetlist is 1-to-1 with parent's.
*/
forboth(parentvars, rel->reltargetlist,
childvars, childrel->reltargetlist)
forboth(parentvars, rel->reltarget.exprs,
childvars, childrel->reltarget.exprs)
{
Var *parentvar = (Var *) lfirst(parentvars);
Node *childvar = (Node *) lfirst(childvars);
@ -1022,7 +1023,7 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
Assert(parent_rows > 0);
rel->rows = parent_rows;
rel->width = rint(parent_size / parent_rows);
rel->reltarget.width = rint(parent_size / parent_rows);
for (i = 0; i < nattrs; i++)
rel->attr_widths[i] = rint(parent_attrsizes[i] / parent_rows);
@ -1495,7 +1496,7 @@ set_dummy_rel_pathlist(RelOptInfo *rel)
{
/* Set dummy size estimates --- we leave attr_widths[] as zeroes */
rel->rows = 0;
rel->width = 0;
rel->reltarget.width = 0;
/* Discard any pre-existing paths; no further need for them */
rel->pathlist = NIL;
@ -1728,11 +1729,11 @@ set_function_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
ListCell *lc;
/*
* Is there a Var for it in reltargetlist? If not, the query did not
* reference the ordinality column, or at least not in any way that
* would be interesting for sorting.
* Is there a Var for it in rel's targetlist? If not, the query did
* not reference the ordinality column, or at least not in any way
* that would be interesting for sorting.
*/
foreach(lc, rel->reltargetlist)
foreach(lc, rel->reltarget.exprs)
{
Var *node = (Var *) lfirst(lc);
@ -2676,11 +2677,11 @@ remove_unused_subquery_outputs(Query *subquery, RelOptInfo *rel)
* query.
*
* Add all the attributes needed for joins or final output. Note: we must
* look at reltargetlist, not the attr_needed data, because attr_needed
* look at rel's targetlist, not the attr_needed data, because attr_needed
* isn't computed for inheritance child rels, cf set_append_rel_size().
* (XXX might be worth changing that sometime.)
*/
pull_varattnos((Node *) rel->reltargetlist, rel->relid, &attrs_used);
pull_varattnos((Node *) rel->reltarget.exprs, rel->relid, &attrs_used);
/* Add all the attributes used by un-pushed-down restriction clauses. */
foreach(lc, rel->baserestrictinfo)


@ -182,8 +182,6 @@ clamp_row_est(double nrows)
*
* 'baserel' is the relation to be scanned
* 'param_info' is the ParamPathInfo if this is a parameterized path, else NULL
* 'nworkers' are the number of workers among which the work will be
* distributed if the scan is parallel scan
*/
void
cost_seqscan(Path *path, PlannerInfo *root,
@ -225,6 +223,9 @@ cost_seqscan(Path *path, PlannerInfo *root,
startup_cost += qpqual_cost.startup;
cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
cpu_run_cost = cpu_per_tuple * baserel->tuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->pathtarget->cost.startup;
cpu_run_cost += path->pathtarget->cost.per_tuple * path->rows;
/* Adjust costing for parallelism, if used. */
if (path->parallel_degree > 0)
@ -335,6 +336,9 @@ cost_samplescan(Path *path, PlannerInfo *root,
startup_cost += qpqual_cost.startup;
cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
run_cost += cpu_per_tuple * baserel->tuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
@ -601,6 +605,10 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count)
run_cost += cpu_per_tuple * tuples_fetched;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->path.pathtarget->cost.startup;
run_cost += path->path.pathtarget->cost.per_tuple * path->path.rows;
path->path.startup_cost = startup_cost;
path->path.total_cost = startup_cost + run_cost;
}
@ -910,6 +918,10 @@ cost_bitmap_heap_scan(Path *path, PlannerInfo *root, RelOptInfo *baserel,
run_cost += cpu_per_tuple * tuples_fetched;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
@ -1141,6 +1153,10 @@ cost_tidscan(Path *path, PlannerInfo *root,
tid_qual_cost.per_tuple;
run_cost += cpu_per_tuple * ntuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
@ -1185,6 +1201,10 @@ cost_subqueryscan(Path *path, PlannerInfo *root,
cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
run_cost = cpu_per_tuple * baserel->tuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;
path->startup_cost += startup_cost;
path->total_cost += startup_cost + run_cost;
}
@ -1242,6 +1262,10 @@ cost_functionscan(Path *path, PlannerInfo *root,
cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
run_cost += cpu_per_tuple * baserel->tuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
@ -1285,6 +1309,10 @@ cost_valuesscan(Path *path, PlannerInfo *root,
cpu_per_tuple += cpu_tuple_cost + qpqual_cost.per_tuple;
run_cost += cpu_per_tuple * baserel->tuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
@ -1328,6 +1356,10 @@ cost_ctescan(Path *path, PlannerInfo *root,
cpu_per_tuple += cpu_tuple_cost + qpqual_cost.per_tuple;
run_cost += cpu_per_tuple * baserel->tuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->pathtarget->cost.startup;
run_cost += path->pathtarget->cost.per_tuple * path->rows;
path->startup_cost = startup_cost;
path->total_cost = startup_cost + run_cost;
}
@ -2080,6 +2112,10 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
cpu_per_tuple = cpu_tuple_cost + restrict_qual_cost.per_tuple;
run_cost += cpu_per_tuple * ntuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->path.pathtarget->cost.startup;
run_cost += path->path.pathtarget->cost.per_tuple * path->path.rows;
path->path.startup_cost = startup_cost;
path->path.total_cost = startup_cost + run_cost;
}
@ -2250,7 +2286,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
outersortkeys,
outer_path->total_cost,
outer_path_rows,
outer_path->parent->width,
outer_path->pathtarget->width,
0.0,
work_mem,
-1.0);
@ -2276,7 +2312,7 @@ initial_cost_mergejoin(PlannerInfo *root, JoinCostWorkspace *workspace,
innersortkeys,
inner_path->total_cost,
inner_path_rows,
inner_path->parent->width,
inner_path->pathtarget->width,
0.0,
work_mem,
-1.0);
@ -2500,7 +2536,8 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
* off.
*/
else if (enable_material && innersortkeys != NIL &&
relation_byte_size(inner_path_rows, inner_path->parent->width) >
relation_byte_size(inner_path_rows,
inner_path->pathtarget->width) >
(work_mem * 1024L))
path->materialize_inner = true;
else
@ -2539,6 +2576,10 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
cpu_per_tuple = cpu_tuple_cost + qp_qual_cost.per_tuple;
run_cost += cpu_per_tuple * mergejointuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->jpath.path.pathtarget->cost.startup;
run_cost += path->jpath.path.pathtarget->cost.per_tuple * path->jpath.path.rows;
path->jpath.path.startup_cost = startup_cost;
path->jpath.path.total_cost = startup_cost + run_cost;
}
@ -2671,7 +2712,7 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
* optimization in the cost estimate, but for now, we don't.
*/
ExecChooseHashTableSize(inner_path_rows,
inner_path->parent->width,
inner_path->pathtarget->width,
true, /* useskew */
&numbuckets,
&numbatches,
@ -2687,9 +2728,9 @@ initial_cost_hashjoin(PlannerInfo *root, JoinCostWorkspace *workspace,
if (numbatches > 1)
{
double outerpages = page_size(outer_path_rows,
outer_path->parent->width);
outer_path->pathtarget->width);
double innerpages = page_size(inner_path_rows,
inner_path->parent->width);
inner_path->pathtarget->width);
startup_cost += seq_page_cost * innerpages;
run_cost += seq_page_cost * (innerpages + 2 * outerpages);
@ -2919,6 +2960,10 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
cpu_per_tuple = cpu_tuple_cost + qp_qual_cost.per_tuple;
run_cost += cpu_per_tuple * hashjointuples;
/* tlist eval costs are paid per output row, not per tuple scanned */
startup_cost += path->jpath.path.pathtarget->cost.startup;
run_cost += path->jpath.path.pathtarget->cost.per_tuple * path->jpath.path.rows;
path->jpath.path.startup_cost = startup_cost;
path->jpath.path.total_cost = startup_cost + run_cost;
}
@ -3063,7 +3108,7 @@ cost_rescan(PlannerInfo *root, Path *path,
*/
Cost run_cost = cpu_tuple_cost * path->rows;
double nbytes = relation_byte_size(path->rows,
path->parent->width);
path->pathtarget->width);
long work_mem_bytes = work_mem * 1024L;
if (nbytes > work_mem_bytes)
@ -3090,7 +3135,7 @@ cost_rescan(PlannerInfo *root, Path *path,
*/
Cost run_cost = cpu_operator_cost * path->rows;
double nbytes = relation_byte_size(path->rows,
path->parent->width);
path->pathtarget->width);
long work_mem_bytes = work_mem * 1024L;
if (nbytes > work_mem_bytes)
@ -3356,6 +3401,20 @@ cost_qual_eval_walker(Node *node, cost_qual_eval_context *context)
return cost_qual_eval_walker((Node *) linitial(asplan->subplans),
context);
}
else if (IsA(node, PlaceHolderVar))
{
/*
* A PlaceHolderVar should be given cost zero when considering general
* expression evaluation costs. The expense of doing the contained
* expression is charged as part of the tlist eval costs of the scan
* or join where the PHV is first computed (see set_rel_width and
* add_placeholders_to_joinrel). If we charged it again here, we'd be
* double-counting the cost for each level of plan that the PHV
* bubbles up through. Hence, return without recursing into the
* phexpr.
*/
return false;
}
/* recurse into children */
return expression_tree_walker(node, cost_qual_eval_walker,
@ -3751,7 +3810,7 @@ get_parameterized_baserel_size(PlannerInfo *root, RelOptInfo *rel,
* anyway we must keep the rowcount estimate the same for all paths for the
* joinrel.)
*
* We set only the rows field here. The width field was already set by
* We set only the rows field here. The reltarget field was already set by
* build_joinrel_tlist, and baserestrictcost is not used for join rels.
*/
void
@ -4156,6 +4215,8 @@ set_foreign_size_estimates(PlannerInfo *root, RelOptInfo *rel)
* that have to be calculated at this relation. This is the amount of data
* we'd need to pass upwards in case of a sort, hash, etc.
*
* This function also sets reltarget.cost, so it's a bit misnamed now.
*
* NB: this works best on plain relations because it prefers to look at
* real Vars. For subqueries, set_subquery_size_estimates will already have
* copied up whatever per-column estimates were made within the subquery,
@ -4174,12 +4235,16 @@ set_rel_width(PlannerInfo *root, RelOptInfo *rel)
bool have_wholerow_var = false;
ListCell *lc;
foreach(lc, rel->reltargetlist)
/* Vars are assumed to have cost zero, but other exprs do not */
rel->reltarget.cost.startup = 0;
rel->reltarget.cost.per_tuple = 0;
foreach(lc, rel->reltarget.exprs)
{
Node *node = (Node *) lfirst(lc);
/*
* Ordinarily, a Var in a rel's reltargetlist must belong to that rel;
* Ordinarily, a Var in a rel's targetlist must belong to that rel;
* but there are corner cases involving LATERAL references where that
* isn't so. If the Var has the wrong varno, fall through to the
* generic case (it doesn't seem worth the trouble to be any smarter).
@ -4239,10 +4304,18 @@ set_rel_width(PlannerInfo *root, RelOptInfo *rel)
}
else if (IsA(node, PlaceHolderVar))
{
/*
* We will need to evaluate the PHV's contained expression while
* scanning this rel, so be sure to include it in reltarget.cost.
*/
PlaceHolderVar *phv = (PlaceHolderVar *) node;
PlaceHolderInfo *phinfo = find_placeholder_info(root, phv, false);
QualCost cost;
tuple_width += phinfo->ph_width;
cost_qual_eval_node(&cost, (Node *) phv->phexpr, root);
rel->reltarget.cost.startup += cost.startup;
rel->reltarget.cost.per_tuple += cost.per_tuple;
}
else
{
@ -4252,10 +4325,15 @@ set_rel_width(PlannerInfo *root, RelOptInfo *rel)
* can using the expression type information.
*/
int32 item_width;
QualCost cost;
item_width = get_typavgwidth(exprType(node), exprTypmod(node));
Assert(item_width > 0);
tuple_width += item_width;
/* Not entirely clear if we need to account for cost, but do so */
cost_qual_eval_node(&cost, node, root);
rel->reltarget.cost.startup += cost.startup;
rel->reltarget.cost.per_tuple += cost.per_tuple;
}
}
@ -4292,7 +4370,7 @@ set_rel_width(PlannerInfo *root, RelOptInfo *rel)
}
Assert(tuple_width >= 0);
rel->width = tuple_width;
rel->reltarget.width = tuple_width;
}
/*


@ -1550,6 +1550,7 @@ bitmap_scan_cost_est(PlannerInfo *root, RelOptInfo *rel, Path *ipath)
bpath.path.type = T_BitmapHeapPath;
bpath.path.pathtype = T_BitmapHeapScan;
bpath.path.parent = rel;
bpath.path.pathtarget = &(rel->reltarget);
bpath.path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
bpath.path.pathkeys = NIL;
@ -1578,6 +1579,7 @@ bitmap_and_cost_est(PlannerInfo *root, RelOptInfo *rel, List *paths)
apath.path.type = T_BitmapAndPath;
apath.path.pathtype = T_BitmapAnd;
apath.path.parent = rel;
apath.path.pathtarget = &(rel->reltarget);
apath.path.param_info = NULL; /* not used in bitmap trees */
apath.path.pathkeys = NIL;
apath.bitmapquals = paths;
@ -1590,6 +1592,7 @@ bitmap_and_cost_est(PlannerInfo *root, RelOptInfo *rel, List *paths)
bpath.path.type = T_BitmapHeapPath;
bpath.path.pathtype = T_BitmapHeapScan;
bpath.path.parent = rel;
bpath.path.pathtarget = &(rel->reltarget);
bpath.path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
bpath.path.pathkeys = NIL;
@ -1809,10 +1812,10 @@ check_index_only(RelOptInfo *rel, IndexOptInfo *index)
/*
* Add all the attributes needed for joins or final output. Note: we must
* look at reltargetlist, not the attr_needed data, because attr_needed
* look at rel's targetlist, not the attr_needed data, because attr_needed
* isn't computed for inheritance child rels.
*/
pull_varattnos((Node *) rel->reltargetlist, rel->relid, &attrs_used);
pull_varattnos((Node *) rel->reltarget.exprs, rel->relid, &attrs_used);
/* Add all the attributes used by restriction clauses. */
foreach(lc, rel->baserestrictinfo)


@ -476,7 +476,7 @@ build_path_tlist(PlannerInfo *root, Path *path)
int resno = 1;
ListCell *v;
foreach(v, rel->reltargetlist)
foreach(v, rel->reltarget.exprs)
{
/* Do we really need to copy here? Not sure */
Node *node = (Node *) copyObject(lfirst(v));
@ -875,9 +875,8 @@ create_result_plan(PlannerInfo *root, ResultPath *best_path)
List *tlist;
List *quals;
/* The tlist will be installed later, since we have no RelOptInfo */
Assert(best_path->path.parent == NULL);
tlist = NIL;
/* This is a bit useless currently, because rel will have empty tlist */
tlist = build_path_tlist(root, &best_path->path);
/* best_path->quals is just bare clauses */
@ -2183,7 +2182,7 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
/*
* If rel is a base relation, detect whether any system columns are
* requested from the rel. (If rel is a join relation, rel->relid will be
* 0, but there can be no Var with relid 0 in the reltargetlist or the
* 0, but there can be no Var with relid 0 in the rel's targetlist or the
* restriction clauses, so we skip this in that case. Note that any such
* columns in base relations that were joined are assumed to be contained
* in fdw_scan_tlist.) This is a bit of a kluge and might go away someday,
@ -2198,10 +2197,10 @@ create_foreignscan_plan(PlannerInfo *root, ForeignPath *best_path,
/*
* First, examine all the attributes needed for joins or final output.
* Note: we must look at reltargetlist, not the attr_needed data,
* Note: we must look at rel's targetlist, not the attr_needed data,
* because attr_needed isn't computed for inheritance child rels.
*/
pull_varattnos((Node *) rel->reltargetlist, scan_relid, &attrs_used);
pull_varattnos((Node *) rel->reltarget.exprs, scan_relid, &attrs_used);
/* Add all the attributes used by restriction clauses. */
foreach(lc, rel->baserestrictinfo)
@ -3455,7 +3454,7 @@ copy_generic_path_info(Plan *dest, Path *src)
dest->startup_cost = src->startup_cost;
dest->total_cost = src->total_cost;
dest->plan_rows = src->rows;
dest->plan_width = src->parent->width;
dest->plan_width = src->pathtarget->width;
dest->parallel_aware = src->parallel_aware;
}
else


@ -211,10 +211,11 @@ add_vars_to_targetlist(PlannerInfo *root, List *vars,
attno -= rel->min_attr;
if (rel->attr_needed[attno] == NULL)
{
/* Variable not yet requested, so add to reltargetlist */
/* Variable not yet requested, so add to rel's targetlist */
/* XXX is copyObject necessary here? */
rel->reltargetlist = lappend(rel->reltargetlist,
copyObject(var));
rel->reltarget.exprs = lappend(rel->reltarget.exprs,
copyObject(var));
/* reltarget cost and width will be computed later */
}
rel->attr_needed[attno] = bms_add_members(rel->attr_needed[attno],
where_needed);


@ -98,14 +98,16 @@ static List *reorder_grouping_sets(List *groupingSets, List *sortclause);
static void standard_qp_callback(PlannerInfo *root, void *extra);
static bool choose_hashed_grouping(PlannerInfo *root,
double tuple_fraction, double limit_tuples,
double path_rows, int path_width,
double path_rows,
Path *cheapest_path, Path *sorted_path,
double dNumGroups, AggClauseCosts *agg_costs);
static bool choose_hashed_distinct(PlannerInfo *root,
double tuple_fraction, double limit_tuples,
double path_rows, int path_width,
double path_rows,
Cost cheapest_startup_cost, Cost cheapest_total_cost,
int cheapest_path_width,
Cost sorted_startup_cost, Cost sorted_total_cost,
int sorted_path_width,
List *sorted_pathkeys,
double dNumDistinctRows);
static List *make_subplanTargetList(PlannerInfo *root, List *tlist,
@ -1467,7 +1469,6 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
AggClauseCosts agg_costs;
int numGroupCols;
double path_rows;
int path_width;
bool use_hashed_grouping = false;
WindowFuncLists *wflists = NULL;
List *activeWindows = NIL;
@ -1672,12 +1673,11 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
standard_qp_callback, &qp_extra);
/*
* Extract rowcount and width estimates for use below. If final_rel
* has been proven dummy, its rows estimate will be zero; clamp it to
* one to avoid zero-divide in subsequent calculations.
* Extract rowcount estimate for use below. If final_rel has been
* proven dummy, its rows estimate will be zero; clamp it to one to
* avoid zero-divide in subsequent calculations.
*/
path_rows = clamp_row_est(final_rel->rows);
path_width = final_rel->width;
/*
* If there's grouping going on, estimate the number of result groups.
@ -1849,7 +1849,7 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
/* Figure cost for sorting */
cost_sort(&sort_path, root, root->query_pathkeys,
cheapest_path->total_cost,
path_rows, path_width,
path_rows, cheapest_path->pathtarget->width,
0.0, work_mem, root->limit_tuples);
}
@ -1881,7 +1881,7 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
use_hashed_grouping =
choose_hashed_grouping(root,
tuple_fraction, limit_tuples,
path_rows, path_width,
path_rows,
cheapest_path, sorted_path,
dNumGroups, &agg_costs);
}
@ -1900,11 +1900,13 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
use_hashed_distinct =
choose_hashed_distinct(root,
tuple_fraction, limit_tuples,
path_rows, path_width,
path_rows,
cheapest_path->startup_cost,
cheapest_path->total_cost,
cheapest_path->pathtarget->width,
sorted_path->startup_cost,
sorted_path->total_cost,
sorted_path->pathtarget->width,
sorted_path->pathkeys,
dNumGroups);
tested_hashed_distinct = true;
@ -2343,11 +2345,12 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
choose_hashed_distinct(root,
tuple_fraction, limit_tuples,
result_plan->plan_rows,
result_plan->startup_cost,
result_plan->total_cost,
result_plan->plan_width,
result_plan->startup_cost,
result_plan->total_cost,
result_plan->startup_cost,
result_plan->total_cost,
result_plan->plan_width,
current_pathkeys,
dNumDistinctRows);
}
@ -2678,10 +2681,13 @@ build_grouping_chain(PlannerInfo *root,
* any logic that uses plan_rows to, eg, estimate qual evaluation costs.)
*
* Note: during initial stages of planning, we mostly consider plan nodes with
* "flat" tlists, containing just Vars. So their evaluation cost is zero
* according to the model used by cost_qual_eval() (or if you prefer, the cost
* is factored into cpu_tuple_cost). Thus we can avoid accounting for tlist
* cost throughout query_planner() and subroutines. But once we apply a
* "flat" tlists, containing just Vars and PlaceHolderVars. The evaluation
* cost of Vars is zero according to the model used by cost_qual_eval() (or if
* you prefer, the cost is factored into cpu_tuple_cost). The evaluation cost
* of a PHV's expression is charged as part of the scan cost of whichever plan
* node first computes it, and then subsequent references to the PHV can be
* taken as having cost zero. Thus we can avoid worrying about tlist cost
* as such throughout query_planner() and subroutines. But once we apply a
* tlist that might contain actual operators, sub-selects, etc, we'd better
* account for its cost. Any set-returning functions in the tlist must also
* affect the estimated rowcount.
@ -3840,7 +3846,7 @@ standard_qp_callback(PlannerInfo *root, void *extra)
static bool
choose_hashed_grouping(PlannerInfo *root,
double tuple_fraction, double limit_tuples,
double path_rows, int path_width,
double path_rows,
Path *cheapest_path, Path *sorted_path,
double dNumGroups, AggClauseCosts *agg_costs)
{
@ -3853,6 +3859,7 @@ choose_hashed_grouping(PlannerInfo *root,
List *current_pathkeys;
Path hashed_p;
Path sorted_p;
int sorted_p_width;
/*
* Executor doesn't support hashed aggregation with DISTINCT or ORDER BY
@ -3890,7 +3897,8 @@ choose_hashed_grouping(PlannerInfo *root,
*/
/* Estimate per-hash-entry space at tuple width... */
hashentrysize = MAXALIGN(path_width) + MAXALIGN(SizeofMinimalTupleHeader);
hashentrysize = MAXALIGN(cheapest_path->pathtarget->width) +
MAXALIGN(SizeofMinimalTupleHeader);
/* plus space for pass-by-ref transition values... */
hashentrysize += agg_costs->transitionSpace;
/* plus the per-hash-entry overhead */
@ -3935,25 +3943,27 @@ choose_hashed_grouping(PlannerInfo *root,
/* Result of hashed agg is always unsorted */
if (target_pathkeys)
cost_sort(&hashed_p, root, target_pathkeys, hashed_p.total_cost,
dNumGroups, path_width,
dNumGroups, cheapest_path->pathtarget->width,
0.0, work_mem, limit_tuples);
if (sorted_path)
{
sorted_p.startup_cost = sorted_path->startup_cost;
sorted_p.total_cost = sorted_path->total_cost;
sorted_p_width = sorted_path->pathtarget->width;
current_pathkeys = sorted_path->pathkeys;
}
else
{
sorted_p.startup_cost = cheapest_path->startup_cost;
sorted_p.total_cost = cheapest_path->total_cost;
sorted_p_width = cheapest_path->pathtarget->width;
current_pathkeys = cheapest_path->pathkeys;
}
if (!pathkeys_contained_in(root->group_pathkeys, current_pathkeys))
{
cost_sort(&sorted_p, root, root->group_pathkeys, sorted_p.total_cost,
path_rows, path_width,
path_rows, sorted_p_width,
0.0, work_mem, -1.0);
current_pathkeys = root->group_pathkeys;
}
@ -3971,7 +3981,7 @@ choose_hashed_grouping(PlannerInfo *root,
if (target_pathkeys &&
!pathkeys_contained_in(target_pathkeys, current_pathkeys))
cost_sort(&sorted_p, root, target_pathkeys, sorted_p.total_cost,
dNumGroups, path_width,
dNumGroups, sorted_p_width,
0.0, work_mem, limit_tuples);
/*
@ -4008,9 +4018,11 @@ choose_hashed_grouping(PlannerInfo *root,
static bool
choose_hashed_distinct(PlannerInfo *root,
double tuple_fraction, double limit_tuples,
double path_rows, int path_width,
double path_rows,
Cost cheapest_startup_cost, Cost cheapest_total_cost,
int cheapest_path_width,
Cost sorted_startup_cost, Cost sorted_total_cost,
int sorted_path_width,
List *sorted_pathkeys,
double dNumDistinctRows)
{
@ -4058,7 +4070,8 @@ choose_hashed_distinct(PlannerInfo *root,
*/
/* Estimate per-hash-entry space at tuple width... */
hashentrysize = MAXALIGN(path_width) + MAXALIGN(SizeofMinimalTupleHeader);
hashentrysize = MAXALIGN(cheapest_path_width) +
MAXALIGN(SizeofMinimalTupleHeader);
/* plus the per-hash-entry overhead */
hashentrysize += hash_agg_entry_size(0);
@ -4089,7 +4102,7 @@ choose_hashed_distinct(PlannerInfo *root,
*/
if (parse->sortClause)
cost_sort(&hashed_p, root, root->sort_pathkeys, hashed_p.total_cost,
dNumDistinctRows, path_width,
dNumDistinctRows, cheapest_path_width,
0.0, work_mem, limit_tuples);
/*
@ -4113,7 +4126,7 @@ choose_hashed_distinct(PlannerInfo *root,
else
current_pathkeys = root->sort_pathkeys;
cost_sort(&sorted_p, root, current_pathkeys, sorted_p.total_cost,
path_rows, path_width,
path_rows, sorted_path_width,
0.0, work_mem, -1.0);
}
cost_group(&sorted_p, root, numDistinctCols, dNumDistinctRows,
@ -4122,7 +4135,7 @@ choose_hashed_distinct(PlannerInfo *root,
if (parse->sortClause &&
!pathkeys_contained_in(root->sort_pathkeys, current_pathkeys))
cost_sort(&sorted_p, root, root->sort_pathkeys, sorted_p.total_cost,
dNumDistinctRows, path_width,
dNumDistinctRows, sorted_path_width,
0.0, work_mem, limit_tuples);
/*
@ -4896,7 +4909,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
* set_baserel_size_estimates, just do a quick hack for rows and width.
*/
rel->rows = rel->tuples;
rel->width = get_relation_data_width(tableOid, NULL);
rel->reltarget.width = get_relation_data_width(tableOid, NULL);
root->total_table_pages = rel->pages;
@ -4912,7 +4925,7 @@ plan_cluster_use_sort(Oid tableOid, Oid indexOid)
/* Estimate the cost of seq scan + sort */
seqScanPath = create_seqscan_path(root, rel, NULL, 0);
cost_sort(&seqScanAndSortPath, root, NIL,
seqScanPath->total_cost, rel->tuples, rel->width,
seqScanPath->total_cost, rel->tuples, rel->reltarget.width,
comparisonCost, maintenance_work_mem, -1.0);
/* Estimate the cost of index scan */

View file

@ -3489,7 +3489,7 @@ eval_const_expressions_mutator(Node *node,
* can optimize field selection from a RowExpr construct.
*
* However, replacing a whole-row Var in this way has a
* pitfall: if we've already built the reltargetlist for the
* pitfall: if we've already built the rel targetlist for the
* source relation, then the whole-row Var is scheduled to be
* produced by the relation scan, but the simple Var probably
* isn't, which will lead to a failure in setrefs.c. This is


@ -929,6 +929,7 @@ create_seqscan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->pathtype = T_SeqScan;
pathnode->parent = rel;
pathnode->pathtarget = &(rel->reltarget);
pathnode->param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->parallel_aware = parallel_degree > 0 ? true : false;
@ -952,6 +953,7 @@ create_samplescan_path(PlannerInfo *root, RelOptInfo *rel, Relids required_outer
pathnode->pathtype = T_SampleScan;
pathnode->parent = rel;
pathnode->pathtarget = &(rel->reltarget);
pathnode->param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->parallel_aware = false;
@ -1008,6 +1010,7 @@ create_index_path(PlannerInfo *root,
pathnode->path.pathtype = indexonly ? T_IndexOnlyScan : T_IndexScan;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = false;
@ -1056,6 +1059,7 @@ create_bitmap_heap_path(PlannerInfo *root,
pathnode->path.pathtype = T_BitmapHeapScan;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = false;
@ -1085,6 +1089,7 @@ create_bitmap_and_path(PlannerInfo *root,
pathnode->path.pathtype = T_BitmapAnd;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = NULL; /* not used in bitmap trees */
/*
@ -1120,6 +1125,7 @@ create_bitmap_or_path(PlannerInfo *root,
pathnode->path.pathtype = T_BitmapOr;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = NULL; /* not used in bitmap trees */
/*
@ -1154,6 +1160,7 @@ create_tidscan_path(PlannerInfo *root, RelOptInfo *rel, List *tidquals,
pathnode->path.pathtype = T_TidScan;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = false;
@ -1185,6 +1192,7 @@ create_append_path(RelOptInfo *rel, List *subpaths, Relids required_outer,
pathnode->path.pathtype = T_Append;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = get_appendrel_parampathinfo(rel,
required_outer);
pathnode->path.parallel_aware = false;
@ -1243,6 +1251,7 @@ create_merge_append_path(PlannerInfo *root,
pathnode->path.pathtype = T_MergeAppend;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = get_appendrel_parampathinfo(rel,
required_outer);
pathnode->path.parallel_aware = false;
@ -1290,7 +1299,7 @@ create_merge_append_path(PlannerInfo *root,
pathkeys,
subpath->total_cost,
subpath->parent->tuples,
subpath->parent->width,
subpath->pathtarget->width,
0.0,
work_mem,
pathnode->limit_tuples);
@ -1322,7 +1331,8 @@ create_result_path(RelOptInfo *rel, List *quals)
ResultPath *pathnode = makeNode(ResultPath);
pathnode->path.pathtype = T_Result;
pathnode->path.parent = NULL;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = NULL; /* there are no other rels... */
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel;
@ -1339,7 +1349,10 @@ create_result_path(RelOptInfo *rel, List *quals)
* In theory we should include the qual eval cost as well, but at present
* that doesn't accomplish much except duplicate work that will be done
* again in make_result; since this is only used for degenerate cases,
* nothing interesting will be done with the path cost values...
* nothing interesting will be done with the path cost values.
*
* (Likewise, we don't worry about pathtarget->cost since that tlist will
* be empty at this point.)
*/
return pathnode;
@ -1359,6 +1372,7 @@ create_material_path(RelOptInfo *rel, Path *subpath)
pathnode->path.pathtype = T_Material;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = subpath->parallel_safe;
@ -1371,7 +1385,7 @@ create_material_path(RelOptInfo *rel, Path *subpath)
subpath->startup_cost,
subpath->total_cost,
subpath->rows,
rel->width);
subpath->pathtarget->width);
return pathnode;
}
@ -1422,6 +1436,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
pathnode->path.pathtype = T_Unique;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = subpath->parallel_safe;
@ -1516,7 +1531,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
cost_sort(&sort_path, root, NIL,
subpath->total_cost,
rel->rows,
rel->width,
subpath->pathtarget->width,
0.0,
work_mem,
-1.0);
@ -1536,7 +1551,7 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
* Estimate the overhead per hashtable entry at 64 bytes (same as in
* planner.c).
*/
int hashentrysize = rel->width + 64;
int hashentrysize = subpath->pathtarget->width + 64;
if (hashentrysize * pathnode->path.rows > work_mem * 1024L)
{
@ -1607,6 +1622,7 @@ create_gather_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
pathnode->path.pathtype = T_Gather;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = false;
@ -1672,6 +1688,7 @@ create_subqueryscan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->pathtype = T_SubqueryScan;
pathnode->parent = rel;
pathnode->pathtarget = &(rel->reltarget);
pathnode->param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->parallel_aware = false;
@ -1697,6 +1714,7 @@ create_functionscan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->pathtype = T_FunctionScan;
pathnode->parent = rel;
pathnode->pathtarget = &(rel->reltarget);
pathnode->param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->parallel_aware = false;
@ -1722,6 +1740,7 @@ create_valuesscan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->pathtype = T_ValuesScan;
pathnode->parent = rel;
pathnode->pathtarget = &(rel->reltarget);
pathnode->param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->parallel_aware = false;
@ -1746,6 +1765,7 @@ create_ctescan_path(PlannerInfo *root, RelOptInfo *rel, Relids required_outer)
pathnode->pathtype = T_CteScan;
pathnode->parent = rel;
pathnode->pathtarget = &(rel->reltarget);
pathnode->param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->parallel_aware = false;
@ -1771,6 +1791,7 @@ create_worktablescan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->pathtype = T_WorkTableScan;
pathnode->parent = rel;
pathnode->pathtarget = &(rel->reltarget);
pathnode->param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->parallel_aware = false;
@ -1806,6 +1827,7 @@ create_foreignscan_path(PlannerInfo *root, RelOptInfo *rel,
pathnode->path.pathtype = T_ForeignScan;
pathnode->path.parent = rel;
pathnode->path.pathtarget = &(rel->reltarget);
pathnode->path.param_info = get_baserel_parampathinfo(root, rel,
required_outer);
pathnode->path.parallel_aware = false;
@ -1938,6 +1960,7 @@ create_nestloop_path(PlannerInfo *root,
pathnode->path.pathtype = T_NestLoop;
pathnode->path.parent = joinrel;
pathnode->path.pathtarget = &(joinrel->reltarget);
pathnode->path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
@ -2000,6 +2023,7 @@ create_mergejoin_path(PlannerInfo *root,
pathnode->jpath.path.pathtype = T_MergeJoin;
pathnode->jpath.path.parent = joinrel;
pathnode->jpath.path.pathtarget = &(joinrel->reltarget);
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,
@ -2060,6 +2084,7 @@ create_hashjoin_path(PlannerInfo *root,
pathnode->jpath.path.pathtype = T_HashJoin;
pathnode->jpath.path.parent = joinrel;
pathnode->jpath.path.pathtarget = &(joinrel->reltarget);
pathnode->jpath.path.param_info =
get_joinrel_parampathinfo(root,
joinrel,


@ -16,6 +16,7 @@
#include "postgres.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/cost.h"
#include "optimizer/pathnode.h"
#include "optimizer/placeholder.h"
#include "optimizer/planmain.h"
@ -388,8 +389,9 @@ add_placeholders_to_base_rels(PlannerInfo *root)
{
RelOptInfo *rel = find_base_rel(root, varno);
rel->reltargetlist = lappend(rel->reltargetlist,
copyObject(phinfo->ph_var));
rel->reltarget.exprs = lappend(rel->reltarget.exprs,
copyObject(phinfo->ph_var));
/* reltarget's cost and width fields will be updated later */
}
}
}
@ -402,11 +404,10 @@ add_placeholders_to_base_rels(PlannerInfo *root)
*
* A join rel should emit a PlaceHolderVar if (a) the PHV is needed above
* this join level and (b) the PHV can be computed at or below this level.
* At this time we do not need to distinguish whether the PHV will be
* computed here or copied up from below.
*/
void
add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel)
add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *inner_rel)
{
Relids relids = joinrel->relids;
ListCell *lc;
@ -422,9 +423,32 @@ add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel)
if (bms_is_subset(phinfo->ph_eval_at, relids))
{
/* Yup, add it to the output */
joinrel->reltargetlist = lappend(joinrel->reltargetlist,
phinfo->ph_var);
joinrel->width += phinfo->ph_width;
joinrel->reltarget.exprs = lappend(joinrel->reltarget.exprs,
phinfo->ph_var);
joinrel->reltarget.width += phinfo->ph_width;
/*
* Charge the cost of evaluating the contained expression if
* the PHV can be computed here but not in either input. This
* is a bit bogus because we make the decision based on the
* first pair of possible input relations considered for the
* joinrel. With other pairs, it might be possible to compute
* the PHV in one input or the other, and then we'd be double
* charging the PHV's cost for some join paths. For now, live
* with that; but we might want to improve it later by
* refiguring the reltarget costs for each pair of inputs.
*/
if (!bms_is_subset(phinfo->ph_eval_at, outer_rel->relids) &&
!bms_is_subset(phinfo->ph_eval_at, inner_rel->relids))
{
QualCost cost;
cost_qual_eval_node(&cost, (Node *) phinfo->ph_var->phexpr,
root);
joinrel->reltarget.cost.startup += cost.startup;
joinrel->reltarget.cost.per_tuple += cost.per_tuple;
}
/* Adjust joinrel's direct_lateral_relids as needed */
joinrel->direct_lateral_relids =
bms_add_members(joinrel->direct_lateral_relids,


@ -102,12 +102,14 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
rel->reloptkind = reloptkind;
rel->relids = bms_make_singleton(relid);
rel->rows = 0;
rel->width = 0;
/* cheap startup cost is interesting iff not all tuples to be retrieved */
rel->consider_startup = (root->tuple_fraction > 0);
rel->consider_param_startup = false; /* might get changed later */
rel->consider_parallel = false; /* might get changed later */
rel->reltargetlist = NIL;
rel->reltarget.exprs = NIL;
rel->reltarget.cost.startup = 0;
rel->reltarget.cost.per_tuple = 0;
rel->reltarget.width = 0;
rel->pathlist = NIL;
rel->ppilist = NIL;
rel->partial_pathlist = NIL;
@ -387,12 +389,14 @@ build_join_rel(PlannerInfo *root,
joinrel->reloptkind = RELOPT_JOINREL;
joinrel->relids = bms_copy(joinrelids);
joinrel->rows = 0;
joinrel->width = 0;
/* cheap startup cost is interesting iff not all tuples to be retrieved */
joinrel->consider_startup = (root->tuple_fraction > 0);
joinrel->consider_param_startup = false;
joinrel->consider_parallel = false;
joinrel->reltargetlist = NIL;
joinrel->reltarget.exprs = NIL;
joinrel->reltarget.cost.startup = 0;
joinrel->reltarget.cost.per_tuple = 0;
joinrel->reltarget.width = 0;
joinrel->pathlist = NIL;
joinrel->ppilist = NIL;
joinrel->partial_pathlist = NIL;
@ -459,7 +463,7 @@ build_join_rel(PlannerInfo *root,
*/
build_joinrel_tlist(root, joinrel, outer_rel);
build_joinrel_tlist(root, joinrel, inner_rel);
add_placeholders_to_joinrel(root, joinrel);
add_placeholders_to_joinrel(root, joinrel, outer_rel, inner_rel);
/*
* add_placeholders_to_joinrel also took care of adding the ph_lateral
@ -609,7 +613,7 @@ build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
Relids relids = joinrel->relids;
ListCell *vars;
foreach(vars, input_rel->reltargetlist)
foreach(vars, input_rel->reltarget.exprs)
{
Var *var = (Var *) lfirst(vars);
RelOptInfo *baserel;
@ -628,7 +632,7 @@ build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
* rels, which will never be seen here.)
*/
if (!IsA(var, Var))
elog(ERROR, "unexpected node type in reltargetlist: %d",
elog(ERROR, "unexpected node type in rel targetlist: %d",
(int) nodeTag(var));
/* Get the Var's original base rel */
@ -639,8 +643,9 @@ build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
if (bms_nonempty_difference(baserel->attr_needed[ndx], relids))
{
/* Yup, add it to the output */
joinrel->reltargetlist = lappend(joinrel->reltargetlist, var);
joinrel->width += baserel->attr_widths[ndx];
joinrel->reltarget.exprs = lappend(joinrel->reltarget.exprs, var);
/* Vars have cost zero, so no need to adjust reltarget.cost */
joinrel->reltarget.width += baserel->attr_widths[ndx];
}
}
}
@ -826,7 +831,6 @@ build_empty_join_rel(PlannerInfo *root)
joinrel->reloptkind = RELOPT_JOINREL;
joinrel->relids = NULL; /* empty set */
joinrel->rows = 1; /* we produce one row for such cases */
joinrel->width = 0; /* it contains no Vars */
joinrel->rtekind = RTE_JOIN;
root->join_rel_list = lappend(root->join_rel_list, joinrel);


@ -61,6 +61,25 @@ typedef struct AggClauseCosts
Size transitionSpace; /* space for pass-by-ref transition data */
} AggClauseCosts;
/*
* This struct contains what we need to know during planning about the
* targetlist (output columns) that a Path will compute. Each RelOptInfo
* includes a default PathTarget, which its individual Paths may merely point
* to. However, in some cases a Path may compute outputs different from other
* Paths, and in that case we make a custom PathTarget struct for it. For
* example, an indexscan might return index expressions that would otherwise
* need to be explicitly calculated.
*
* Note that PathTarget.exprs is just a list of expressions; they do not have
* TargetEntry nodes on top, though those will appear in the finished Plan.
*/
typedef struct PathTarget
{
List *exprs; /* list of expressions to be computed */
QualCost cost; /* cost of evaluating the above */
int width; /* estimated avg width of result tuples */
} PathTarget;
/*----------
* PlannerGlobal
@ -334,17 +353,16 @@ typedef struct PlannerInfo
* if there is just one, a join relation if more than one
* rows - estimated number of tuples in the relation after restriction
* clauses have been applied (ie, output rows of a plan for it)
* width - avg. number of bytes per tuple in the relation after the
* appropriate projections have been done (ie, output width)
* consider_startup - true if there is any value in keeping plain paths for
* this rel on the basis of having cheap startup cost
* consider_param_startup - the same for parameterized paths
* reltargetlist - List of Var and PlaceHolderVar nodes for the values
* we need to output from this relation.
* List is in no particular order, but all rels of an
* appendrel set must use corresponding orders.
* NOTE: in an appendrel child relation, may contain
* arbitrary expressions pulled up from a subquery!
* reltarget - Default Path output tlist for this rel; normally contains
* Var and PlaceHolderVar nodes for the values we need to
* output from this relation.
* List is in no particular order, but all rels of an
* appendrel set must use corresponding orders.
* NOTE: in an appendrel child relation, may contain
* arbitrary expressions pulled up from a subquery!
* pathlist - List of Path nodes, one for each potentially useful
* method of generating the relation
* ppilist - ParamPathInfo nodes for parameterized Paths, if any
@ -451,15 +469,16 @@ typedef struct RelOptInfo
/* size estimates generated by planner */
double rows; /* estimated number of result tuples */
int width; /* estimated avg width of result tuples */
/* per-relation planner control flags */
bool consider_startup; /* keep cheap-startup-cost paths? */
bool consider_param_startup; /* ditto, for parameterized paths? */
bool consider_parallel; /* consider parallel paths? */
/* default result targetlist for Paths scanning this relation */
PathTarget reltarget; /* list of Vars/Exprs, cost, width */
/* materialization information */
List *reltargetlist; /* Vars to be output by scan of relation */
List *pathlist; /* Path structures */
List *ppilist; /* ParamPathInfos used in pathlist */
List *partial_pathlist; /* partial Paths */
@ -744,6 +763,11 @@ typedef struct ParamPathInfo
* the same Path type for multiple Plan types when there is no need to
* distinguish the Plan type during path processing.
*
* "parent" identifies the relation this Path scans, and "pathtarget"
* describes the precise set of output columns the Path would compute.
* In simple cases all Paths for a given rel share the same targetlist,
* which we represent by having path->pathtarget point to parent->reltarget.
*
* "param_info", if not NULL, links to a ParamPathInfo that identifies outer
* relation(s) that provide parameter values to each scan of this path.
* That means this path can only be joined to those rels by means of nestloop
@ -765,7 +789,10 @@ typedef struct Path
NodeTag pathtype; /* tag identifying scan/join method */
RelOptInfo *parent; /* the relation this path can build */
PathTarget *pathtarget; /* list of Vars/Exprs, cost, width */
ParamPathInfo *param_info; /* parameterization info, or NULL if none */
bool parallel_aware; /* engage parallel-aware logic? */
bool parallel_safe; /* OK to use as part of parallel plan? */
int parallel_degree; /* desired parallel degree; 0 = not parallel */


@ -26,7 +26,7 @@ extern void update_placeholder_eval_levels(PlannerInfo *root,
SpecialJoinInfo *new_sjinfo);
extern void fix_placeholder_input_needed_levels(PlannerInfo *root);
extern void add_placeholders_to_base_rels(PlannerInfo *root);
extern void add_placeholders_to_joinrel(PlannerInfo *root,
RelOptInfo *joinrel);
extern void add_placeholders_to_joinrel(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *inner_rel);
#endif /* PLACEHOLDER_H */


@ -4682,24 +4682,24 @@ select v.* from
lateral (select x.q1,y.q1 union all select x.q2,y.q2) v(vx,vy);
vx | vy
-------------------+-------------------
123 |
456 |
123 | 4567890123456789
4567890123456789 | -4567890123456789
123 | 4567890123456789
4567890123456789 | 4567890123456789
123 | 4567890123456789
4567890123456789 | 123
4567890123456789 | 123
123 | 4567890123456789
4567890123456789 | 123
123 | 456
4567890123456789 | 4567890123456789
4567890123456789 | -4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | 123
123 | 4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | 123
123 | 4567890123456789
4567890123456789 | 123
4567890123456789 | 4567890123456789
4567890123456789 | 4567890123456789
123 | 4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | -4567890123456789
123 | 4567890123456789
4567890123456789 | -4567890123456789
123 |
456 |
4567890123456789 |
-4567890123456789 |
(20 rows)
@ -4713,24 +4713,24 @@ select v.* from
lateral (select x.q1,y.q1 from dual union all select x.q2,y.q2 from dual) v(vx,vy);
vx | vy
-------------------+-------------------
123 |
456 |
123 | 4567890123456789
4567890123456789 | -4567890123456789
123 | 4567890123456789
4567890123456789 | 4567890123456789
123 | 4567890123456789
4567890123456789 | 123
4567890123456789 | 123
123 | 4567890123456789
4567890123456789 | 123
123 | 456
4567890123456789 | 4567890123456789
4567890123456789 | -4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | 123
123 | 4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | 123
123 | 4567890123456789
4567890123456789 | 123
4567890123456789 | 4567890123456789
4567890123456789 | 4567890123456789
123 | 4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | 4567890123456789
4567890123456789 | -4567890123456789
123 | 4567890123456789
4567890123456789 | -4567890123456789
123 |
456 |
4567890123456789 |
-4567890123456789 |
(20 rows)