golangGC

Some of my own understanding of Golang garbage collection

Let's look at how the official source explains it

// Garbage collector (GC).
//
// The GC runs concurrently with mutator threads, is type accurate (aka precise), allows multiple
// GC thread to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is
// non-generational and non-compacting. Allocation is done using size segregated per P allocation
// areas to minimize fragmentation while eliminating locks in the common case.
//
// The algorithm decomposes into several steps.
// This is a high level description of the algorithm being used. For an overview of GC a good
// place to start is Richard Jones' gchandbook.org.
//
// The algorithm's intellectual heritage includes Dijkstra's on-the-fly algorithm, see
// Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. 1978.
// On-the-fly garbage collection: an exercise in cooperation. Commun. ACM 21, 11 (November 1978),
// 966-975.
// For journal quality proofs that these steps are complete, correct, and terminate see
// Hudson, R., and Moss, J.E.B. Copying Garbage Collection without stopping the world.
// Concurrency and Computation: Practice and Experience 15(3-5), 2003.
//
// 1. GC performs sweep termination.
//
// a. Stop the world. This causes all Ps to reach a GC safe-point.
//
// b. Sweep any unswept spans. There will only be unswept spans if
// this GC cycle was forced before the expected time.
//
// 2. GC performs the mark phase.
//
// a. Prepare for the mark phase by setting gcphase to _GCmark
// (from _GCoff), enabling the write barrier, enabling mutator
// assists, and enqueueing root mark jobs. No objects may be
// scanned until all Ps have enabled the write barrier, which is
// accomplished using STW.
//
// b. Start the world. From this point, GC work is done by mark
// workers started by the scheduler and by assists performed as
// part of allocation. The write barrier shades both the
// overwritten pointer and the new pointer value for any pointer
// writes (see mbarrier.go for details). Newly allocated objects
// are immediately marked black.
//
// c. GC performs root marking jobs. This includes scanning all
// stacks, shading all globals, and shading any heap pointers in
// off-heap runtime data structures. Scanning a stack stops a
// goroutine, shades any pointers found on its stack, and then
// resumes the goroutine.
//
// d. GC drains the work queue of grey objects, scanning each grey
// object to black and shading all pointers found in the object
// (which in turn may add those pointers to the work queue).
//
// e. Because GC work is spread across local caches, GC uses a
// distributed termination algorithm to detect when there are no
// more root marking jobs or grey objects (see gcMarkDone). At this
// point, GC transitions to mark termination.
//
// 3. GC performs mark termination.
//
// a. Stop the world.
//
// b. Set gcphase to _GCmarktermination, and disable workers and
// assists.
//
// c. Perform housekeeping like flushing mcaches.
//
// 4. GC performs the sweep phase.
//
// a. Prepare for the sweep phase by setting gcphase to _GCoff,
// setting up sweep state and disabling the write barrier.
//
// b. Start the world. From this point on, newly allocated objects
// are white, and allocating sweeps spans before use if necessary.
//
// c. GC does concurrent sweeping in the background and in response
// to allocation. See description below.
//
// 5. When sufficient allocation has taken place, replay the sequence
// starting with 1 above. See discussion of GC rate below.
// Concurrent sweep.
//
// The sweep phase proceeds concurrently with normal program execution.
// The heap is swept span-by-span both lazily (when a goroutine needs another span)
// and concurrently in a background goroutine (this helps programs that are not CPU bound).
// At the end of STW mark termination all spans are marked as "needs sweeping".
//
// The background sweeper goroutine simply sweeps spans one-by-one.
//
// To avoid requesting more OS memory while there are unswept spans, when a
// goroutine needs another span, it first attempts to reclaim that much memory
// by sweeping. When a goroutine needs to allocate a new small-object span, it
// sweeps small-object spans for the same object size until it frees at least
// one object. When a goroutine needs to allocate large-object span from heap,
// it sweeps spans until it frees at least that many pages into heap. There is
// one case where this may not suffice: if a goroutine sweeps and frees two
// nonadjacent one-page spans to the heap, it will allocate a new two-page
// span, but there can still be other one-page unswept spans which could be
// combined into a two-page span.
//
// It's critical to ensure that no operations proceed on unswept spans (that would corrupt
// mark bits in GC bitmap). During GC all mcaches are flushed into the central cache,
// so they are empty. When a goroutine grabs a new span into mcache, it sweeps it.
// When a goroutine explicitly frees an object or sets a finalizer, it ensures that
// the span is swept (either by sweeping it, or by waiting for the concurrent sweep to finish).
// The finalizer goroutine is kicked off only when all spans are swept.
// When the next GC starts, it sweeps all not-yet-swept spans (if any).
// GC rate.
// Next GC is after we've allocated an extra amount of memory proportional to
// the amount already in use. The proportion is controlled by GOGC environment variable
// (100 by default). If GOGC=100 and we're using 4M, we'll GC again when we get to 8M
// (this mark is tracked in next_gc variable). This keeps the GC cost in linear
// proportion to the allocation cost. Adjusting GOGC just changes the linear constant
// (and also the amount of extra memory used).
// Oblets
//
// In order to prevent long pauses while scanning large objects and to
// improve parallelism, the garbage collector breaks up scan jobs for
// objects larger than maxObletBytes into "oblets" of at most
// maxObletBytes. When scanning encounters the beginning of a large
// object, it scans only the first oblet and enqueues the remaining
// oblets as new scan jobs.
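
The "GC rate" rule in the quoted comment can be checked with a little arithmetic. The sketch below assumes the simplified rule exactly as the comment states it (next cycle at live heap plus GOGC percent of it); `nextGCTarget` is a hypothetical helper, not a runtime function, and the real pacer takes more into account.

```go
package main

import "fmt"

// nextGCTarget applies the simplified rule from the comment above:
// the next cycle starts once the heap reaches the live heap plus
// GOGC percent of it. Hypothetical helper, not part of the runtime.
func nextGCTarget(liveBytes uint64, gogc int) uint64 {
	return liveBytes + liveBytes*uint64(gogc)/100
}

func main() {
	const mib = 1 << 20
	for _, gogc := range []int{50, 100, 200} {
		target := nextGCTarget(4*mib, gogc)
		fmt.Printf("GOGC=%d, live=4MiB: next GC at %dMiB\n", gogc, target/mib)
	}
}
```

With 4 MiB live, GOGC=100 gives 8 MiB, matching the example in the comment; GOGC=50 gives 6 MiB and GOGC=200 gives 12 MiB.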

Here is my translation, with my own understanding mixed in

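Garbage collector (GC).
The GC runs concurrently with the mutator threads (the "mutator" is the user program itself, i.e. the goroutines that allocate objects and modify pointers, not a helper of the GC), is type accurate (precise), and allows multiple
GC threads to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is
non-generational and non-compacting. Allocation uses size-segregated, per-P allocation
areas to minimize fragmentation while eliminating locks in the common case.
The algorithm decomposes into several steps.
1. GC performs sweep termination.
a. STW. This brings every logical processor (the P in the GMP model) to a GC safe-point (a point where it is safe to start a GC).
b. Sweep any unswept spans (spans are described in Golang's memory allocation). There is only work to do here if this GC cycle was forced before the expected time, that is, before the previous cycle finished sweeping.
2. GC performs the mark phase.
a. Preparation: set gcphase from [_GCoff] to [_GCmark], enable the [write barrier], enable mutator assists, and enqueue the root mark jobs.
No object may be scanned until all Ps have enabled the [write barrier]; everything up to this point is done under STW.
b. Start the world. From here on, GC work is done by mark workers started by the scheduler and by assists performed as part of allocation. On every pointer write, the write barrier shades (marks grey) both the overwritten pointer and the new pointer value. Newly allocated objects are immediately marked black.
c. GC performs the root marking jobs. This includes scanning all stacks, shading all globals, and shading any heap pointers in off-heap runtime data structures. Scanning a stack stops the goroutine, shades every pointer found on its stack, and then resumes the goroutine.
d. GC drains the work queue of grey objects, scanning each grey object to black and shading every pointer found in it, i.e. marking the referenced object grey (which in turn adds those pointers to the work queue).
e. Because GC work is spread across local caches, GC uses a distributed termination algorithm to detect when there are no more root marking jobs or grey objects (see gcMarkDone). At this point, GC transitions to mark termination.
3. GC performs mark termination.
a. STW.
b. Set gcphase to [_GCmarktermination], and disable the mark workers and assists.
c. Perform housekeeping such as flushing mcaches (for mcache, see Golang's memory allocation).
4. GC performs the sweep phase.
a. Preparation: set gcphase to [_GCoff], set up the sweep state, and disable the write barrier.
b. Start the world. From this point on, newly allocated objects are white, and allocation sweeps spans before use if necessary.
c. GC sweeps concurrently in the background and in response to allocation.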
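
To make the white/grey/black vocabulary in the steps above concrete, here is a toy, single-threaded sketch of tri-color mark and sweep. It is only an illustration: the `object` type and the `mark` and `sweep` functions are hypothetical, and there is no concurrency, write barrier, or span handling as in the real runtime.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not reached yet; garbage if still white after marking
	grey               // reached, but its pointers have not been scanned
	black              // reached and fully scanned
)

// object is a toy heap object (hypothetical, not a runtime type).
type object struct {
	name string
	refs []*object
	c    color
}

// mark shades the roots grey, then drains the grey work queue:
// each grey object has its referents shaded and is then turned black.
func mark(roots []*object) {
	var queue []*object
	shade := func(o *object) {
		if o != nil && o.c == white {
			o.c = grey
			queue = append(queue, o)
		}
	}
	for _, r := range roots {
		shade(r)
	}
	for len(queue) > 0 {
		o := queue[len(queue)-1]
		queue = queue[:len(queue)-1]
		for _, ref := range o.refs {
			shade(ref)
		}
		o.c = black
	}
}

// sweep keeps black objects (resetting them to white for the next cycle)
// and drops everything that is still white.
func sweep(heap []*object) []*object {
	var live []*object
	for _, o := range heap {
		if o.c == black {
			o.c = white
			live = append(live, o)
		}
	}
	return live
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"} // nothing points to c
	a.refs = []*object{b}
	b.refs = []*object{a} // cycles are handled by the color check
	heap := []*object{a, b, c}

	mark([]*object{a})
	for _, o := range sweep(heap) {
		fmt.Println("live:", o.name)
	}
}
```

Here `a` is the only root, so `a` and `b` survive and `c` is swept.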

When garbage collection runs

How Golang GC is triggered

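  1. Automatically, when the heap grows past the trigger threshold (the default minimum heap target is 4 MB at GOGC=100)
  2. Manually, when the program calls runtime.GC()
  3. Periodically: a GC is forced if none has run for 2 minutes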

Some GC tuning suggestions

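  1. Keep the allocation rate reasonable and raise the mutator's CPU utilization
    • When you need goroutines or new objects, create them on demand, at the point of use, instead of allocating a large number up front. Too many objects or too many goroutines put pressure on the scheduler.
  2. Reduce allocations and reuse memory that has already been allocated
    • Use something like sync.Pool to hold reusable objects, which cuts down on frequent marking and sweeping.
  3. Tune GOGC
    • As we have seen, GC triggering is governed by the pacing algorithm, whose key job is estimating the heap size at which the next GC should start. So when facing a flood of requests, can we avoid frequent GCs by setting GOGC higher, so that GC triggers later and less often, leaving more of the machine to user code? The answer is yes, at the cost of a larger heap, because a higher GOGC also raises the amount of extra memory used.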

That's all, I'm off~