# Arthas: The Ultimate Tool for Diagnosing Java Production Issues

Source: https://www.wdbyte.com/2019/11/arthas/

<div class="main" id="main">
	<div class="post-time" style="font-size: .9em;">
		<p>
			<i class="fa fa-calendar"/> Last updated: <span>2019-11-06</span> 
		</p>
	</div>
	<div class="post-toc" id="toc">
	   <div>
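	      <h3>Frequently used commands (worth bookmarking)</h3>

1. Start Arthas from a terminal inside the container
```shell
cd /opt/arthas/
java -jar arthas-boot.jar
# when the process list appears, enter the number shown in brackets (1 here)
```

2. Frequently used commands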

```shell

# 1. Show the arguments and return value of calls that take longer than 200 ms
watch com.sie.snest.engine.model.property.SelectionProperty convertToRead '{params[0],params[3],returnObj}' '#cost>200' -n 100  -x 3

# 2. Measure where time is spent
trace com.sie.snest.engine.data.access.BussModelDataAccess search 'params[0].getModel().getName()=="equip_lubrication_calibration"' -n 100 --skipJDKMethod false

trace com.sie.mbm.edo.calibration.models.Calibration periodUnitList  -n 100 --skipJDKMethod false

# 3. Inspect arguments
watch com.sie.mi.ioc.attribution.strategy.models.AttributionStrategy calcAttribution  '{params,returnObj,throwExp}' -n 20  -x 3

watch com.sie.mi.ioc.attribution.strategy.models.AttributionStrategy calcAttribution  '{params,returnObj,throwExp}' 'params[0]=="04dkj8y9h3kny"'  -n 20  -x 3

# 4. Only show calls that threw an exception
watch com.sie.mi.ioc.attribution.strategy.models.AttributionStrategy calcAttribution  '{params,returnObj,throwExp}' 'throwExp != null'  -n 20  -x 3

# 5. More examples

watch com.sie.snest.engine.api.response.ResponseHandle beforeBodyWrite  '{params,returnObj,throwExp}' -n 100 --skipJDKMethod false

ognl --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader '@com.sie.snest.engine.container.EngineContainer@getBussinessAppGroupContainer().getAppDataInfoMap().keySet()'
trace com.sie.snest.engine.api.distributed.RpcInvocation invoke -n 5 --skipJDKMethod false

trace com.sie.snest.engine.api.distributed.RpcInvocation invoke 'ModelMeta == "mbm_mes_process_match_rule"'

stack com.sie.snest.engine.api.response.ResponseHandle beforeBodyWrite  -n 20 --skipJDKMethod false

trace org.springframework.web.servlet.DispatcherServlet doDispatch  -n 20 --skipJDKMethod false

watch com.sie.mi.ioc.attribution.strategy.models.AttributionStrategy calcAttribution '{params,returnObj.reportResult.candidateResult.dimList[0].dataList[0].reportData,throwExp}' 'params[0]=="04dkj8y9h3kny"' -n 20  -x 3

watch com.sie.mi.ioc.attribution.strategy.models.AttributionStrategy calcAttribution '{params,returnObj,throwExp}' 'params[0]=="04dkj8y9h3kny"' -n 20  -x 3

watch com.sie.snest.engine.api.RpcController service  '{params,returnObj,throwExp}' -n 20  -x 3

watch com.sie.snest.engine.api.response.ResponseHandle beforeBodyWrite '{params,returnObj,throwExp}'  -n 100  -x 3

trace com.sie.snest.engine.api.distributed.RpcInvocation invoke 'params[3]=="mbm_mes_process_match_rule"' -n 100 --skipJDKMethod false

trace com.sie.snest.engine.api.RpcController service  'params[1].getParameter("businessIndex")=="04dkj8y9h3kny"' -n 20 --skipJDKMethod false

watch com.sie.snest.engine.api.RpcController service '{params,returnObj,throwExp}' 'params[1].getParameter("businessIndex")=="04dkj8y9h3kny"' -n 100  -x 3

trace com.sie.snest.engine.api.response.ResponseHandle beforeBodyWrite  -n 20 --skipJDKMethod false

```
	   </div>
		<div>
			<b>Contents</b>
		</div>
		<div>
			<a style="margin: 0px;" href="#前言">Preface</a>
		</div>
		<div>
			<a style="margin: 0px;" href="#1arthas--介绍">1. Introduction to Arthas</a>
		</div>
		<div>
			<a style="margin: 0px;" href="#2arthas--使用场景">2. When to use Arthas</a>
		</div>
		<div>
			<a style="margin: 0px;" href="#3arthas--怎么用">3. How to use Arthas</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#31-安装">3.1 Installation</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#32-运行">3.2 Running</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#33-web-console">3.3 web console</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#34-常用命令">3.4 Common commands</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#35-退出">3.5 Exiting</a>
		</div>
		<div>
			<a style="margin: 0px;" href="#4arthas-常用操作">4. Common Arthas operations</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#41-全局监控">4.1 Global monitoring</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#42-cpu-为什么起飞了">4.2 Why is the CPU spiking?</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#43-线程池线程状态">4.3 Thread pool thread states</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#44-线程死锁">4.4 Thread deadlock</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#45-反编译">4.5 Decompiling</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#46-查看字段信息">4.6 Viewing field information</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#47-查看方法信息">4.7 Viewing method information</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#48-对变量的值很是好奇">4.8 Curious about a variable's value</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#49-程序有没有问题">4.9 Is something wrong with the program?</a>
		</div>
		<div>
			<a style="margin: 40px;" href="#491-运行较慢耗时较长">4.9.1 Slow execution, long running time</a>
		</div>
		<div>
			<a style="margin: 40px;" href="#492-统计方法耗时">4.9.2 Collecting method timing statistics</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#410-想观察方法信息">4.10 Observing method information</a>
		</div>
		<div>
			<a style="margin: 40px;" href="#4101-观察方法的入参出参信息">4.10.1 Observing a method's arguments and return value</a>
		</div>
		<div>
			<a style="margin: 40px;" href="#4102-观察方法的调用路径">4.10.2 Observing a method's call path</a>
		</div>
		<div>
			<a style="margin: 40px;" href="#4103-方法调用时空隧道">4.10.3 The method-call time tunnel</a>
		</div>
		<div>
			<a style="margin: 20px;" href="#45-火焰图分析">4.5. Flame graph analysis</a>
		</div>
		<div>
			<a style="margin: 40px;" href="#451使用案例">4.5.1. Usage example</a>
		</div>
		<div>
			<a style="margin: 40px;" href="#452-复杂命令">4.5.2. Advanced commands</a>
		</div>
	</div>
	<div class="post-content">
		<p>
			<img src="https://cdn.debug.group/git/2019/arthas-1572972116473.png" alt="Arthas logo"/>
		</p>
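		<h1 id="前言">Preface</h1>
		<p>Before <strong>Arthas</strong>, when you hit a Java production problem such as a CPU spike, a sudden jump in load, or an out-of-memory error, you had to look up commands, search the web, and then work through jps, jstack, jmap, jhat, jstat, hprof and so on. After all that scrambling you still might not find the cause. Now most common problems can be located easily with <strong>Arthas</strong>, so you can resolve them quickly, contain the damage, and leave work on time.</p>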
		<p>
			<img src="https://cdn.debug.group/img/23/02/Xnip2023-02-21_21-19-50.jpeg" alt=""/>
		</p>
		<h1 id="1arthas--介绍">1. Introduction to Arthas</h1>
		<p>
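			<strong>Arthas</strong> is a <strong>Java diagnostic</strong> tool that <code>Alibaba</code> open-sourced in September 2018. It supports <code>JDK6+</code>, works through an interactive command line with <code>Tab</code> completion, and makes it easy to locate and diagnose problems in running production programs. At the time of writing it had collected more than 17,000 <code>Star</code>s.</p>
		<p>
			<strong>Arthas</strong> has very detailed official documentation, which this article also draws on. The <code>Issues</code> of the open-source project on <code>Github</code> contain not only bug reports but also a large number of usage examples that are worth studying.</p>
		<p>Source code: <em>https://github.com/alibaba/arthas</em>
		</p>
		<p>Official documentation: <em>https://alibaba.github.io/arthas</em>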
		</p>
		
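		<h1 id="2arthas--使用场景">2. When to use Arthas</h1>
		<p>Thanks to its powerful and rich feature set, <strong>Arthas</strong> can do far more than you might expect. The list below covers only a few common situations; you can explore further use cases once you are familiar with <strong>Arthas</strong>.</p>
		<ol>
			<li>Is there a global view of how the system is running?</li>
			<li>Why has CPU usage gone up again, and what exactly is consuming the CPU?</li>
			<li>Are any of the running threads deadlocked? Are any blocked?</li>
			<li>The program takes a long time to run. Which part is slow, and how can it be measured?</li>
			<li>Which jar was this class loaded from? Why are there all kinds of class-related Exceptions?</li>
			<li>Why isn't the code I changed being executed? Did I forget to commit? Am I on the wrong branch?</li>
			<li>The problem cannot be debugged in production. Is adding logs and redeploying really the only option?</li>
			<li>Is there a way to monitor the real-time state of the JVM?</li>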
		</ol>
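		<h1 id="3arthas--怎么用">3. How to use Arthas</h1>
		<p>As mentioned above, <strong>Arthas</strong> is a Java diagnostic tool with an interactive command-line interface. Because it is written in Java, you can simply download the jar and run it.</p>
		<h2 id="31-安装">3.1 Installation</h2>
		<p>You can download it from the official Github site. If that is slow, try the Gitee mirror hosted in China.</p>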
		<pre>
			<code class="language-shell"># Download from Github
wget https://alibaba.github.io/arthas/arthas-boot.jar
# Or download from Gitee
wget https://arthas.gitee.io/arthas-boot.jar
# Print help
java -jar arthas-boot.jar -h
</code>
		</pre>
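		<h2 id="32-运行">3.2 Running</h2>
		<p>
			<strong>Arthas</strong> is just a Java program, so you can run it directly with <code>java -jar</code>. You choose the Java process to monitor either when starting it or right after it starts.</p>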
		<pre>
			<code class="language-shell"># Option 1: start first, then choose the Java process
java -jar arthas-boot.jar
# Choose a process (type the number in [], not the PID, and press Enter)
[INFO] arthas-boot version: 3.1.4
[INFO] Found existing java process, please choose one and hit RETURN.
* [1]: 11616 com.Arthas
  [2]: 8676
  [3]: 16200 org.jetbrains.jps.cmdline.Launcher
  [4]: 21032 org.jetbrains.idea.maven.server.RemoteMavenServer

# Option 2: pass the Java process PID when starting
java -jar arthas-boot.jar [PID]
</code>
		</pre>
		<p>You can find the PID with the <code>ps</code> command or with the <code>jps</code> command that ships with the JDK.</p>
		<pre>
			<code class="language-shell"># List running Java processes
$ jps -mlvV 
# Filter the Java process list
$ jps -mlvV | grep [xxx]
</code>
		</pre>
		<p>
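			Filtering for the process you want with <code>jps</code>:</p>
		<p>
			<img src="https://cdn.debug.group/git/2019/1570979767404.png" alt="Filtering processes with jps"/>
		</p>
		<p>Once the <strong>Arthas</strong> logo appears you can start issuing diagnostic commands, which are described in detail below.</p>
		<p>
			<img src="https://cdn.debug.group/git/2019/image-20191106003512451.png" alt="Arthas startup"/>
		</p>
		<p>For more ways to start Arthas, see the output of the help option.</p>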
		<pre>
			<code class="language-shell"># Other usages
EXAMPLES:
  java -jar arthas-boot.jar &lt;pid&gt;
  java -jar arthas-boot.jar --target-ip 0.0.0.0
  java -jar arthas-boot.jar --telnet-port 9999 --http-port -1
  java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws'
  java -jar arthas-boot.jar --tunnel-server 'ws://192.168.10.11:7777/ws' --agent-id bvDOe8XbTM2pQWjF4cfw
  java -jar arthas-boot.jar --stat-url 'http://192.168.10.11:8080/api/stat'
  java -jar arthas-boot.jar -c 'sysprop; thread' &lt;pid&gt;
  java -jar arthas-boot.jar -f batch.as &lt;pid&gt;
  java -jar arthas-boot.jar --use-version 3.1.4
  java -jar arthas-boot.jar --versions
  java -jar arthas-boot.jar --session-timeout 3600
  java -jar arthas-boot.jar --attach-only
  java -jar arthas-boot.jar --repo-mirror aliyun --use-http
</code>
		</pre>
		<h2 id="33-web-console">3.3 web console</h2>
		<p>
			<strong>Arthas</strong> also provides a <code>Web Console</code>. It starts automatically once Arthas has attached to a process, and you can open it at http://127.0.0.1:8563/. It works exactly like the terminal console.</p>
		<p>
			<img src="https://cdn.debug.group/git/2019/1570979937637.png" alt="1570979937637"/>
		</p>
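		<h2 id="34-常用命令">3.4 Common commands</h2>
		<p>The table below lists some common <a href="https://www.wdbyte.com/2019/11/arthas/" target="_blank">
				<strong>Arthas</strong>
			</a> commands. You may not know how to use them yet; each is introduced in the sections that follow.</p>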
		<table class="ui celled table">
			<thead>
				<tr>
					<th>Command</th>
					<th>Description</th>
				</tr>
			</thead>
			<tbody>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/dashboard.html" target="_blank">dashboard</a>
					</td>
					<td>Real-time data panel for the current system</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/thread.html" target="_blank">
							<strong>thread</strong>
						</a>
					</td>
					<td>Shows thread stack information for the current JVM</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/watch.html" target="_blank">
							<strong>watch</strong>
						</a>
					</td>
					<td>Observes method execution data</td>
				</tr>
				<tr>
					<td>
						<strong>
							<a href="https://alibaba.github.io/arthas/trace.html" target="_blank">trace</a>
						</strong>
					</td>
					<td>Traces the call path inside a method and prints the time spent at each node on the path</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/stack.html" target="_blank">
							<strong>stack</strong>
						</a>
					</td>
					<td>Prints the call path through which the current method was invoked</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/tt.html" target="_blank">
							<strong>tt</strong>
						</a>
					</td>
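					<td>Time tunnel for method execution data: records the arguments and return information of every call to the specified method, so that calls made at different times can be inspected</td>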
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/monitor.html" target="_blank">monitor</a>
					</td>
					<td>Monitors method execution</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/jvm.html" target="_blank">jvm</a>
					</td>
					<td>Shows information about the current JVM</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/vmoption.html" target="_blank">vmoption</a>
					</td>
					<td>Views and updates JVM diagnostic options</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/sc.html" target="_blank">sc</a>
					</td>
					<td>Shows information about classes loaded by the JVM</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/sm.html" target="_blank">sm</a>
					</td>
					<td>Shows the methods of loaded classes</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/jad.html" target="_blank">jad</a>
					</td>
					<td>Decompiles the source of a specified loaded class</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/classloader.html" target="_blank">classloader</a>
					</td>
					<td>Shows the classloader inheritance tree, URLs, and class loading information</td>
				</tr>
				<tr>
					<td>
						<a href="https://alibaba.github.io/arthas/heapdump.html" target="_blank">heapdump</a>
					</td>
					<td>Heap dump, similar to the jmap command</td>
				</tr>
			</tbody>
		</table>
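		<h2 id="35-退出">3.5 Exiting</h2>
		<p>When you exit with shutdown, <strong>Arthas</strong> also automatically resets all classes it has enhanced.</p>
		<h1 id="4arthas-常用操作">4. Common Arthas operations</h1>
		<p>Now that we know what <strong>Arthas</strong> is and how to start it, the following sections walk through how to use it in several situations. If you run into trouble with a command, every command accepts <code>-h</code> to print its help.</p>
		<p>First, write a test class that exhibits a variety of problems and run it, then use <strong>Arthas</strong> to locate them.</p>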
		<pre>
			<code class="language-java">import java.util.HashSet;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import lombok.extern.slf4j.Slf4j;

/**
 * Arthas Demo
 * WeChat official account: 程序猿阿朗
 *
 * @author niujinpeng
 */
@Slf4j
public class Arthas {

    private static HashSet&lt;String&gt; hashSet = new HashSet&lt;&gt;();
    /** Thread pool of size 1 */
    private static ExecutorService executorService = Executors.newFixedThreadPool(1);

    public static void main(String[] args) {
        // Simulate high CPU usage; commented out here, enable it when testing
        // cpu();
        // Simulate a blocked thread
        thread();
        // Simulate a thread deadlock
        deadThread();
        // Keep adding data to the hashSet collection
        addHashSetThread();
    }

    /**
     * Keeps adding data to the hashSet collection
     */
    public static void addHashSetThread() {
        // Start a background thread that adds one entry every 10 seconds
        new Thread(() -&gt; {
            int count = 0;
            while (true) {
                try {
                    hashSet.add(&quot;count&quot; + count);
                    Thread.sleep(10000);
                    count++;
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }).start();
    }

    public static void cpu() {
        cpuHigh();
        cpuNormal();
    }

    /**
     * A thread that consumes an extreme amount of CPU
     */
    private static void cpuHigh() {
        Thread thread = new Thread(() -&gt; {
            while (true) {
                log.info(&quot;cpu start 100&quot;);
            }
        });
        // Submit to the thread pool
        executorService.submit(thread);
    }

    /**
     * Threads that consume a normal amount of CPU
     */
    private static void cpuNormal() {
        for (int i = 0; i &lt; 10; i++) {
            new Thread(() -&gt; {
                while (true) {
                    log.info(&quot;cpu start&quot;);
                    try {
                        Thread.sleep(3000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }).start();
        }
    }

    /**
     * Simulates a blocked thread by submitting to a thread pool that is already full
     */
    private static void thread() {
        Thread thread = new Thread(() -&gt; {
            while (true) {
                log.debug(&quot;thread start&quot;);
                try {
                    Thread.sleep(3000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });
        // Submit to the thread pool
        executorService.submit(thread);
    }

    /**
     * Deadlock
     */
    private static void deadThread() {
        // Create the two resources to lock
        Object resourceA = new Object();
        Object resourceB = new Object();
        // Thread A locks resourceA first, then waits for resourceB
        Thread threadA = new Thread(() -&gt; {
            synchronized (resourceA) {
                log.info(Thread.currentThread() + &quot; get ResourceA&quot;);
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                log.info(Thread.currentThread() + &quot;waiting get resourceB&quot;);
                synchronized (resourceB) {
                    log.info(Thread.currentThread() + &quot; get resourceB&quot;);
                }
            }
        });

        // Thread B locks resourceB first, then waits for resourceA
        Thread threadB = new Thread(() -&gt; {
            synchronized (resourceB) {
                log.info(Thread.currentThread() + &quot; get ResourceB&quot;);
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                log.info(Thread.currentThread() + &quot;waiting get resourceA&quot;);
                synchronized (resourceA) {
                    log.info(Thread.currentThread() + &quot; get resourceA&quot;);
                }
            }
        });
        threadA.start();
        threadB.start();
    }
}
</code>
		</pre>
		<h2 id="41-全局监控">4.1 Global monitoring</h2>
		<p>Use the <strong>dashboard</strong> command for an overview of the program's threads, memory, GC, and runtime environment.</p>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571212470373.png" alt="dashboard"/>
		</p>
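		<p>For intuition about where figures like these come from, here is a minimal Java sketch that reads a few comparable values through the JDK's standard <code>java.lang.management</code> beans. The class name <code>MiniDashboard</code> is made up for illustration, and this is not a description of how Arthas implements <code>dashboard</code>; it only shows that thread, heap, GC and runtime data can be read from inside any JVM.</p>

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.RuntimeMXBean;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: a one-shot, text-only subset of what a dashboard shows. */
class MiniDashboard {

    /** Builds a multi-line snapshot of thread, heap, GC and runtime data for this JVM. */
    static String snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();

        StringBuilder sb = new StringBuilder();
        sb.append("threads: live=").append(threads.getThreadCount())
          .append(" daemon=").append(threads.getDaemonThreadCount()).append('\n');
        sb.append("heap: used=").append(heap.getUsed() / 1024 / 1024).append("M")
          .append(" committed=").append(heap.getCommitted() / 1024 / 1024).append("M\n");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append("gc: ").append(gc.getName())
              .append(" count=").append(gc.getCollectionCount())
              .append(" time(ms)=").append(gc.getCollectionTime()).append('\n');
        }
        sb.append("runtime: java.version=").append(System.getProperty("java.version"))
          .append(" uptime(ms)=").append(runtime.getUptime()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

		<p>The difference in practice is that Arthas attaches to an already running process and refreshes the panel continuously, whereas this sketch has to be compiled into the program being observed.</p>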
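		<h2 id="42-cpu-为什么起飞了">4.2 Why is the CPU spiking?</h2>
		<p>The demo code above contains a busy loop that keeps the <code>CPU</code> spinning and burns a lot of <code>CPU</code> time. How do we find it?</p>
		<p>Use <strong>thread</strong> to list <strong>all</strong> threads together with each thread's <code>CPU</code> usage. In the screenshot, the thread with ID 12 is using 100% of the CPU.
<img src="https://cdn.debug.group/git/2019/1570983440457.png" alt=""/>
		</p>
		<p>Use <strong>thread 12</strong> to inspect thread 12, the one with high CPU usage. The output shows the method and line number where the CPU is being spent (the line numbers may differ from the code above, because the code was reformatted when this article was written).</p>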
		<p>
			<img src="https://cdn.debug.group/git/2019/1570983401254.png" alt=""/>
		</p>
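		<p>The steps above first look at the overall thread list and then drill into one thread. If you only want the threads with the highest CPU usage, use <strong>thread -n [number of threads to show]</strong> directly to list the <strong>Top N</strong> threads by CPU usage.</p>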
		<p>
			<img src="https://cdn.debug.group/git/2019/1570983061047.png" alt=""/>
		</p>
		<p>This locates the method with the highest CPU usage.</p>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571016675083.png" alt=""/>
		</p>
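		<p>The idea behind ranking threads by CPU usage can be sketched in plain Java: read each thread's accumulated CPU time twice and compare the difference over the window. The class <code>TopCpuThread</code> and its method names are assumptions made for this illustration; the sketch uses the JDK's <code>ThreadMXBean</code> and is not taken from the Arthas source.</p>

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: finds the thread that used the most CPU during a short window. */
class TopCpuThread {

    /** Returns the name of the busiest thread in the window, or null if it cannot be determined. */
    static String busiestThread(long windowMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadCpuTimeSupported()) {
            return null; // this JVM cannot report per-thread CPU time
        }
        if (!mx.isThreadCpuTimeEnabled()) {
            mx.setThreadCpuTimeEnabled(true);
        }
        // First sample: CPU time in nanoseconds per thread id (-1 means unavailable)
        Map<Long, Long> before = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            before.put(id, mx.getThreadCpuTime(id));
        }
        try {
            Thread.sleep(windowMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        // Second sample: keep the thread with the largest increase
        long bestId = -1;
        long bestDelta = -1;
        for (long id : mx.getAllThreadIds()) {
            Long then = before.get(id);
            long now = mx.getThreadCpuTime(id);
            if (then == null || then < 0 || now < 0) {
                continue; // thread started or ended during the window
            }
            if (now - then > bestDelta) {
                bestDelta = now - then;
                bestId = id;
            }
        }
        ThreadInfo info = bestId < 0 ? null : mx.getThreadInfo(bestId);
        return info == null ? null : info.getThreadName();
    }

    /** Starts a daemon thread that spins in a tight loop, similar in spirit to cpuHigh() in the demo. */
    static Thread startSpinner(String name) {
        Thread t = new Thread(() -> {
            long counter = 0;
            while (!Thread.currentThread().isInterrupted()) {
                counter++;
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        Thread spinner = startSpinner("cpu-spinner");
        System.out.println("busiest thread: " + busiestThread(500));
        spinner.interrupt();
    }
}
```

		<p>Measuring the increase over a window rather than the total since start-up keeps threads that were busy only during JVM start-up from dominating the ranking.</p>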
		<h2 id="43-线程池线程状态">4.3 Thread pool thread states</h2>
		<p>Before tracking down thread problems, let's review the common thread states:</p>
		<ul>
			<li>
				<strong>RUNNABLE</strong> Running</li>
			<li>
				<strong>TIMED_WAITING</strong> A thread enters <strong>TIMED_WAITING</strong> after calling one of the following:
<ol>
					<li>Thread#sleep()</li>
					<li>Object#wait() with a timeout</li>
					<li>Thread#join() with a timeout</li>
					<li>LockSupport#parkNanos()</li>
					<li>LockSupport#parkUntil()</li>
				</ol>
			</li>
			<li>
				<strong>WAITING</strong> A thread enters WAITING after calling one of the following:
<ol>
					<li>Object#wait() without a timeout</li>
					<li>Thread#join() without a timeout</li>
					<li>LockSupport#park()</li>
				</ol>
			</li>
			<li>
				<strong>BLOCKED</strong> Blocked, waiting for a lock</li>
		</ul>
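		<p>The states listed above can be reproduced with a few lines of Java. The sketch below is only an illustration (the class and method names are invented here): it starts one daemon thread per state and polls <code>Thread#getState()</code> until the expected state shows up.</p>

```java
/** Hypothetical demo: puts daemon threads into TIMED_WAITING, WAITING and BLOCKED. */
class ThreadStatesDemo {

    /** Polls until the thread reaches the expected state or the timeout expires. */
    static boolean awaitState(Thread t, Thread.State expected, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (t.getState() != expected && System.nanoTime() < deadline) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return t.getState() == expected;
    }

    private static Thread daemon(String name, Runnable body) {
        Thread t = new Thread(body, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Thread.sleep() always has a timeout, so the thread ends up in TIMED_WAITING. */
    static Thread sleeper() {
        return daemon("sleeper", () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        });
    }

    /** Object.wait() without a timeout puts the thread in WAITING. */
    static Thread waiter() {
        Object monitor = new Object();
        return daemon("waiter", () -> {
            synchronized (monitor) {
                while (true) { // loop guards against spurious wake-ups
                    try {
                        monitor.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
    }

    /** Trying to enter a monitor that another thread holds puts the thread in BLOCKED. */
    static Thread blockedOn(Object lock) {
        return daemon("blocked", () -> {
            synchronized (lock) {
                // reached only after the holder releases the lock
            }
        });
    }

    public static void main(String[] args) {
        System.out.println("sleeper TIMED_WAITING: "
                + awaitState(sleeper(), Thread.State.TIMED_WAITING, 2000));
        System.out.println("waiter WAITING: "
                + awaitState(waiter(), Thread.State.WAITING, 2000));
        Object lock = new Object();
        synchronized (lock) { // main holds the lock while the other thread tries to take it
            System.out.println("blocked BLOCKED: "
                    + awaitState(blockedOn(lock), Thread.State.BLOCKED, 2000));
        }
    }
}
```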
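		<p>The demo code defines a thread pool of size 1. The <code>cpuHigh</code> method submits one task to it and the <code>thread</code> method submits another. Because the pool's only worker is already busy, the second task cannot start and has to wait.</p>
		<p>Use <strong>thread | grep pool</strong> to view the threads that belong to the thread pool.</p>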
		<p>
			<img src="https://cdn.debug.group/git/2019/1571020871537.png" alt=""/>
		</p>
		<p>The output shows that the thread pool has a thread in the <strong>WAITING</strong> state.</p>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571021838323.png" alt=""/>
		</p>
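		<p>What happens to the second submission can be checked directly in Java. The following sketch (class name and latches are assumptions for illustration) submits two long-running tasks to a pool created with <code>Executors.newFixedThreadPool(1)</code> and reads the number of active workers and queued tasks from the underlying <code>ThreadPoolExecutor</code>.</p>

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: two long tasks on a pool with a single worker thread. */
class SingleWorkerPoolDemo {

    /** Returns {activeCount, queuedTasks} observed while the first task is still running. */
    static int[] activeAndQueued() {
        // newFixedThreadPool returns a ThreadPoolExecutor, so the cast exposes its counters
        ThreadPoolExecutor pool = (ThreadPoolExecutor) Executors.newFixedThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Runnable longTask = () -> {
            started.countDown();
            try {
                release.await(); // keep the worker occupied until released
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.submit(longTask); // occupies the only worker
            pool.submit(longTask); // cannot start: stays in the work queue
            started.await(5, TimeUnit.SECONDS);
            return new int[] { pool.getActiveCount(), pool.getQueue().size() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new int[] { -1, -1 };
        } finally {
            release.countDown(); // let both tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] result = activeAndQueued();
        System.out.println("active=" + result[0] + " queued=" + result[1]);
    }
}
```

		<p>Note that with this kind of pool the submitting thread does not block; the task simply sits in the queue until the worker becomes free.</p>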
		<h2 id="44-线程死锁">4.4 Thread deadlock</h2>
		<p>The <code>deadThread</code> method in the demo code creates a deadlock. The <strong>thread -b</strong> command locates the deadlock directly.</p>
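		<p>A related facility exists in the JDK itself: <code>ThreadMXBean#findMonitorDeadlockedThreads()</code> reports threads that are deadlocked on object monitors. The sketch below is an illustration under that assumption, with made-up class and method names; it is not a description of how <code>thread -b</code> works internally. It creates the same lock-ordering deadlock as the demo with daemon threads and then asks the JVM which threads are stuck.</p>

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

/** Hypothetical sketch: create a monitor deadlock and detect it from inside the JVM. */
class DeadlockProbe {

    /** Starts two daemon threads that take resourceA and resourceB in opposite order. */
    static void startDeadlock() {
        Object resourceA = new Object();
        Object resourceB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        spawn("threadA", resourceA, resourceB, bothHoldFirstLock);
        spawn("threadB", resourceB, resourceA, bothHoldFirstLock);
    }

    private static void spawn(String name, Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first lock too
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    System.out.println(name + " got both locks"); // never reached
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
    }

    /** Polls until a monitor deadlock is reported; returns the sorted names of the threads involved. */
    static List<String> deadlockedThreadNames(long timeoutMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock exists
            if (ids != null) {
                List<String> names = new ArrayList<>();
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    if (info != null) {
                        names.add(info.getThreadName());
                    }
                }
                Collections.sort(names);
                return names;
            }
            if (System.nanoTime() >= deadline) {
                return Collections.emptyList();
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Collections.emptyList();
            }
        }
    }

    public static void main(String[] args) {
        startDeadlock();
        System.out.println("deadlocked: " + deadlockedThreadNames(2000));
    }
}
```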
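		<pre>
			<code class="language-java">/**
 * Deadlock
 */
private static void deadThread() {
    // Create the two resources to lock
    Object resourceA = new Object();
    Object resourceB = new Object();
    // Thread A locks resourceA first, then waits for resourceB
    Thread threadA = new Thread(() -&gt; {
        synchronized (resourceA) {
            log.info(Thread.currentThread() + &quot; get ResourceA&quot;);
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            log.info(Thread.currentThread() + &quot;waiting get resourceB&quot;);
            synchronized (resourceB) {
                log.info(Thread.currentThread() + &quot; get resourceB&quot;);
            }
        }
    });
    // Thread B locks resourceB first, then waits for resourceA
    Thread threadB = new Thread(() -&gt; {
        synchronized (resourceB) {
            log.info(Thread.currentThread() + &quot; get ResourceB&quot;);
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            log.info(Thread.currentThread() + &quot;waiting get resourceA&quot;);
            synchronized (resourceA) {
                log.info(Thread.currentThread() + &quot; get resourceA&quot;);
            }
        }
    });
    threadA.start();
    threadB.start();
}
</code>
		</pre>
		<h2 id="46-查看字段信息">4.6 Viewing field information</h2>
		<p>Use the <strong>sc -d -f</strong> command to view a class's fields. The listing shows the entries for the <code>hashSet</code> and <code>executorService</code> fields of <code>com.Arthas</code>.</p>
		<pre>
			<code class="language-shell">modifierprivate,static
                   type    java.util.HashSet
                   name    hashSet
                   value   [count1, count2]

modifierprivate,static
                   type    java.util.concurrent.ExecutorService
                   name    executorService
                   value   java.util.concurrent.ThreadPoolExecutor@71c03156[Ru
                           nning, pool size = 1, active threads = 1, queued ta
                           sks = 0, completed tasks = 0]

Affect(row-cnt:1) cost in 9 ms.
</code>
		</pre>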
		<h2 id="47-查看方法信息">4.7 Viewing method information</h2>
		<p>Use the <strong>sm</strong> command to view a class's methods.</p>
		<pre>
			<code class="language-shell">[arthas@22180]$ sm com.Arthas
com.Arthas &lt;init&gt;()V
com.Arthas start()V
com.Arthas thread()V
com.Arthas deadThread()V
com.Arthas lambda$cpuHigh$1()V
com.Arthas cpuHigh()V
com.Arthas lambda$thread$3()V
com.Arthas addHashSetThread()V
com.Arthas cpuNormal()V
com.Arthas cpu()V
com.Arthas lambda$addHashSetThread$0()V
com.Arthas lambda$deadThread$4(Ljava/lang/Object;Ljava/lang/Object;)V
com.Arthas lambda$deadThread$5(Ljava/lang/Object;Ljava/lang/Object;)V
com.Arthas lambda$cpuNormal$2()V
Affect(row-cnt:16) cost in 6 ms.
</code>
		</pre>
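		<h2 id="48-对变量的值很是好奇">4.8 Curious about a variable's value</h2>
		<p>Use the <strong>ognl</strong> command. With ognl expressions you can easily get at the information you want.</p>
		<p>Still using the demo code above, let's look at the data in the <code>hashSet</code> variable:</p>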
		<p>
			<img src="https://cdn.debug.group/git/2019/1571196786678.png" alt=""/>
		</p>
		<p>View the static variable <code>hashSet</code>.</p>
		<pre>
			<code class="language-shell">[arthas@19856]$ ognl '@com.Arthas@hashSet'
@HashSet[
    @String[count1],
    @String[count2],
    @String[count29],
    @String[count28],
    @String[count0],
    @String[count27],
    @String[count5],
    @String[count26],
    @String[count6],
    @String[count25],
    @String[count3],
    @String[count24],
</code>
		</pre>
		<p>View the size of the static variable hashSet.</p>
		<pre>
			<code class="language-shell">[arthas@19856]$ ognl '@com.Arthas@hashSet.size()'
	@Integer[57]
</code>
		</pre>
		<p>You can even modify it.</p>
		<pre>
			<code class="language-shell">[arthas@19856]$ ognl  '@com.Arthas@hashSet.add(&quot;test&quot;)'
	@Boolean[true]
[arthas@19856]$
# Check the element that was just added
[arthas@19856]$ ognl  '@com.Arthas@hashSet' | grep test
    @String[test],
[arthas@19856]$
</code>
		</pre>
		<p>
			<code>ognl</code> can do a lot more; see <a href="https://github.com/alibaba/arthas/issues/71" target="_blank">special uses of ognl expressions ( https://github.com/alibaba/arthas/issues/71 )</a>.</p>
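		<p>Conceptually, an expression such as <code>@com.Arthas@hashSet</code> reads a static field regardless of its access modifier. A rough plain-Java equivalent uses reflection, as in the sketch below. The <code>Target</code> class is a stand-in for the demo class and all names are assumptions for illustration; OGNL itself is a separate expression language and is not used here.</p>

```java
import java.lang.reflect.Field;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: read and modify a private static field through reflection. */
class StaticFieldPeek {

    /** Stand-in for the demo class, with a private static set like com.Arthas.hashSet. */
    static class Target {
        private static final Set<String> hashSet = new HashSet<>();
        static {
            hashSet.add("count0");
            hashSet.add("count1");
        }
    }

    /** Reads a static field by name, ignoring its access modifier. */
    static Object readStatic(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            field.setAccessible(true);
            return field.get(null); // null receiver because the field is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    @SuppressWarnings("unchecked")
    public static void main(String[] args) {
        Set<String> set = (Set<String>) readStatic(Target.class, "hashSet");
        System.out.println("size before: " + set.size());
        set.add("test"); // the same live object, so the change is visible to the class
        System.out.println("contains test: "
                + ((Set<String>) readStatic(Target.class, "hashSet")).contains("test"));
    }
}
```

		<p>Because the returned object is the live instance, calling <code>add</code> on it changes the running program's state, which is why this kind of operation should be used with care in production.</p>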
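		<h2 id="49-程序有没有问题">4.9 Is something wrong with the program?</h2>
		<h3 id="491-运行较慢耗时较长">4.9.1 Slow execution, long running time</h3>
		<p>Use the <strong>trace</strong> command to trace a method and measure where its time goes.</p>
		<p>This time we switch to a different demo: a minimal Spring Boot project (if you would rather not use Spring Boot, you can also start it from a main method in UserController). The controller's <code>getUser</code> method calls <code>userService.get(uid);</code>, which in turn performs the <code>check</code>, <code>service</code>, <code>redis</code> and <code>mysql</code> operations.</p>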
		<pre>
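			<code class="language-java">import java.util.HashMap;

import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@Slf4j
public class UserController {

    @Autowired
    private UserServiceImpl userService;

    @GetMapping(value = &quot;/user&quot;)
    public HashMap&lt;String, Object&gt; getUser(Integer uid) throws Exception {
        // Simulate a user lookup
        userService.get(uid);
        HashMap&lt;String, Object&gt; hashMap = new HashMap&lt;&gt;();
        hashMap.put(&quot;uid&quot;, uid);
        hashMap.put(&quot;name&quot;, &quot;name&quot; + uid);
        return hashMap;
    }
}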
</code>
		</pre>
		<p>The demo Service:</p>
		<pre>
			<code class="language-java">import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class UserServiceImpl {

    public void get(Integer uid) throws Exception {
        check(uid);
        service(uid);
        redis(uid);
        mysql(uid);
    }

    public void service(Integer uid) throws Exception {
        int count = 0;
        for (int i = 0; i &lt; 10; i++) {
            count += i;
        }
        log.info(&quot;service  end {}&quot;, count);
    }

    public void redis(Integer uid) throws Exception {
        int count = 0;
        for (int i = 0; i &lt; 10000; i++) {
            count += i;
        }
        log.info(&quot;redis  end {}&quot;, count);
    }

    public void mysql(Integer uid) throws Exception {
        long count = 0;
        for (int i = 0; i &lt; 10000000; i++) {
            count += i;
        }
        log.info(&quot;mysql end {}&quot;, count);
    }

    public boolean check(Integer uid) throws Exception {
        // Reject a missing or negative uid
        if (uid == null || uid &lt; 0) {
            log.error(&quot;invalid uid, uid:{}&quot;, uid);
            throw new Exception(&quot;invalid uid&quot;);
        }
        return true;
    }
}

</code>
		</pre>
		<p>After starting the Spring Boot application, use the <strong>trace</strong> command to start measuring time.</p>
		<pre>
			<code class="language-shell">[arthas@6592]$ trace com.UserController getUser
</code>
		</pre>
		<p>Request the <code>/user</code> endpoint and the timing information appears; the <code>com.UserServiceImpl:get()</code> method stands out as the slow one.
<img src="https://cdn.debug.group/git/2019/1571208153793.png" alt=""/>
		</p>
		<p>Continue by tracing the slow method, then request the endpoint again.</p>
		<pre>
			<code class="language-shell">[arthas@6592]$ trace com.UserServiceImpl get
</code>
		</pre>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571208245597.png" alt=""/>
		</p>
		<p>It is now clear that the <code>mysql</code> method of <code>com.UserServiceImpl</code> takes the most time.</p>
		<pre>
			<code class="language-java">Affect(class-cnt:1 , method-cnt:1) cost in 31 ms.
`---ts=2019-10-16 14:40:10;thread_name=http-nio-8080-exec-8;id=1f;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@23a918c7
    `---[6.792201ms] com.UserServiceImpl:get()
        +---[0.008ms] com.UserServiceImpl:check() #17
        +---[0.076ms] com.UserServiceImpl:service() #18
        +---[0.1089ms] com.UserServiceImpl:redis() #19
        `---[6.528899ms] com.UserServiceImpl:mysql() #20
</code>
		</pre>
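		<p>To appreciate what <strong>trace</strong> saves you from, this is roughly what you would have to write by hand to get a similar per-call breakdown: wrap every step with <code>System.nanoTime()</code> and compare the results. The class <code>ManualTrace</code> and the sleep-based stand-in workloads are assumptions for illustration only.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: time a sequence of named steps by hand. */
class ManualTrace {

    /** Runs the steps in insertion order and returns each step's elapsed time in nanoseconds. */
    static Map<String, Long> timeSteps(Map<String, Runnable> steps) {
        Map<String, Long> costs = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> step : steps.entrySet()) {
            long start = System.nanoTime();
            step.getValue().run();
            costs.put(step.getKey(), System.nanoTime() - start);
        }
        return costs;
    }

    /** Returns the name of the step with the largest elapsed time, or null for an empty map. */
    static String slowest(Map<String, Long> costs) {
        String worst = null;
        long max = -1;
        for (Map.Entry<String, Long> cost : costs.entrySet()) {
            if (cost.getValue() > max) {
                max = cost.getValue();
                worst = cost.getKey();
            }
        }
        return worst;
    }

    /** Stand-in workload: sleeps to simulate a call that takes the given time. */
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Map<String, Runnable> steps = new LinkedHashMap<>();
        steps.put("check", () -> pause(1));
        steps.put("service", () -> pause(2));
        steps.put("redis", () -> pause(5));
        steps.put("mysql", () -> pause(50));
        Map<String, Long> costs = timeSteps(steps);
        costs.forEach((name, nanos) ->
                System.out.printf("[%.3fms] %s%n", nanos / 1_000_000.0, name));
        System.out.println("slowest: " + slowest(costs));
    }
}
```

		<p>The hand-written version requires a code change and a redeploy for every method you want to measure, which is exactly the cycle that attaching a tracer at runtime avoids.</p>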
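		<h3 id="492-统计方法耗时">4.9.2 Collecting method timing statistics</h3>
		<p>Use the <strong>monitor</strong> command to collect statistics about a method's executions.</p>
		<p>Report statistics every 5 seconds for the <code>get</code> method of the <code>com.UserServiceImpl</code> class.</p>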
		<pre>
			<code class="language-shell">monitor -c 5 com.UserServiceImpl get
</code>
		</pre>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571210158018.png" alt=""/>
		</p>
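		<p>The kind of per-window aggregation such a report needs can be sketched as a small counter class: record each call's duration and outcome, then periodically emit totals, average response time and failure rate and start a new window. The <code>CallStats</code> class below is a hypothetical illustration of that bookkeeping, not Arthas code.</p>

```java
/** Hypothetical sketch: per-window call statistics (total, success, fail, average time, failure rate). */
class CallStats {
    private long total;
    private long failed;
    private long totalNanos;

    /** Records one finished call. */
    synchronized void record(long elapsedNanos, boolean threw) {
        total++;
        totalNanos += elapsedNanos;
        if (threw) {
            failed++;
        }
    }

    /** Times one call and records whether it threw an exception. */
    void time(Runnable call) {
        long start = System.nanoTime();
        boolean threw = true;
        try {
            call.run();
            threw = false;
        } finally {
            record(System.nanoTime() - start, threw);
        }
    }

    /** Returns {total, success, fail, avgMillis, failRate} for the current window, then resets it. */
    synchronized double[] snapshotAndReset() {
        double avgMillis = total == 0 ? 0.0 : (totalNanos / (double) total) / 1_000_000.0;
        double failRate = total == 0 ? 0.0 : failed / (double) total;
        double[] row = { total, total - failed, failed, avgMillis, failRate };
        total = 0;
        failed = 0;
        totalNanos = 0;
        return row;
    }

    public static void main(String[] args) {
        CallStats stats = new CallStats();
        for (int i = 0; i < 3; i++) {
            stats.time(() -> { /* stand-in for the monitored method */ });
        }
        double[] row = stats.snapshotAndReset();
        System.out.println("total=" + (long) row[0] + " success=" + (long) row[1]
                + " fail=" + (long) row[2] + " avg-ms=" + row[3] + " fail-rate=" + row[4]);
    }
}
```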
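		<h2 id="410-想观察方法信息">4.10 Observing method information</h2>
		<p>The examples below use the two demo programs from earlier in the article.</p>
		<h3 id="4101-观察方法的入参出参信息">4.10.1 Observing a method's arguments and return value</h3>
		<p>Use the <strong>watch</strong> command to view a method's input parameters, return value, and thrown exceptions.</p>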
		<pre>
			<code class="language-shell"> USAGE:
   watch [-b] [-e] [-x &lt;value&gt;] [-f] [-h] [-n &lt;value&gt;] [-E] [-M &lt;value&gt;] [-s] class-pattern method-pattern express [condition-express]

SUMMARY:
   Display the input/output parameter, return object, and thrown exception of specified method invocation
   The express may be one of the following expression (evaluated dynamically):
           target : the object
            clazz : the object's class
           method : the constructor or method
           params : the parameters array of method
     params[0..n] : the element of parameters array
        returnObj : the returned object of method
         throwExp : the throw exception of method
         isReturn : the method ended by return
          isThrow : the method ended by throwing exception
            #cost : the execution time in ms of method invocation
 Examples:
   watch -b org.apache.commons.lang.StringUtils isBlank params
   watch -f org.apache.commons.lang.StringUtils isBlank returnObj
   watch org.apache.commons.lang.StringUtils isBlank '{params, target, returnObj}' -x 2
   watch -bf *StringUtils isBlank params
   watch *StringUtils isBlank params[0]
   watch *StringUtils isBlank params[0] params[0].length==1
   watch *StringUtils isBlank params '#cost&gt;100'
   watch -E -b org\.apache\.commons\.lang\.StringUtils isBlank params[0]

WIKI:
   https://alibaba.github.io/arthas/watch
</code>
		</pre>
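		<p>Common operations:</p>
		<pre>
			<code class="language-shell"># View the first argument and the return value
$ watch com.Arthas addHashSet '{params[0],returnObj}'
# View the first argument and the size of the return value
$ watch com.Arthas addHashSet '{params[0],returnObj.size}'
# View the first argument and whether the return value contains 'count10'
$ watch com.Arthas addHashSet '{params[0],returnObj.contains(&quot;count10&quot;)}'
# View the first argument and the return value converted with toString
$ watch com.Arthas addHashSet '{params[0],returnObj.toString()}'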
</code>
		</pre>
		<p>Viewing the arguments and return value.</p>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571196483469.png" alt=""/>
		</p>
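		<p>Viewing the thrown exception information.</p>
		<h3 id="4102-观察方法的调用路径">4.10.2 Observing a method's call path</h3>
		<p>Use the <strong>stack</strong> command to see how a method was called.</p>
		<pre>
			<code class="language-shell"># Show the call path of the mysql method of class com.UserServiceImpl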
stack com.UserServiceImpl mysql
</code>
		</pre>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571210706602.png" alt=""/>
		</p>
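		<h3 id="4103-方法调用时空隧道">4.10.3 The method-call time tunnel</h3>
		<p>Use the <strong>tt</strong> command to record the details of method executions.</p>
		<blockquote>
			<p>
				The <strong>tt</strong> command is a time tunnel for method execution data: it records the arguments and return information of every call to the specified method, so that calls made at different times can be inspected.</p>
		</blockquote>
		<p>Common operations:</p>
		<p>Start recording method calls: tt -t com.UserServiceImpl check</p>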
		<p>
			<img src="https://cdn.debug.group/git/2019/1571212007249.png" alt=""/>
		</p>
		<p>In the records, the entry with INDEX=1001 has IS-EXP = true, which means that call threw an exception.</p>
		<p>List the recorded method calls: tt -l</p>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571212080071.png" alt=""/>
		</p>
		<p>Show the details of a recorded call (-i selects the INDEX): tt -i 1001</p>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571212151064.png" alt=""/>
		</p>
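		<p>This shows the exception information for the record with INDEX=1001.</p>
		<p>To invoke the method again with the recorded arguments, select the record and add -p.</p>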
		<pre>
			<code class="language-java">tt -i 1001 -p
</code>
		</pre>
		<p>
			<img src="https://cdn.debug.group/git/2019/1571212227058.png" alt=""/>
		</p>
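		<p>The record-and-replay idea can be illustrated with a JDK dynamic proxy that stores each call's method, arguments and outcome under an index, and can later invoke the same method again with the stored arguments. Everything in the sketch is an assumption for illustration: the <code>UserService</code> interface, the class names, and the choice to start numbering at 1000 to mirror the screenshots. Arthas itself works on already loaded classes rather than through a proxy.</p>

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: record method calls through a proxy and replay them by index. */
class TimeTunnelSketch {

    /** Stand-in service interface, modeled on UserServiceImpl.check from the article. */
    interface UserService {
        boolean check(Integer uid) throws Exception;
    }

    /** One recorded invocation. */
    static final class Record {
        final int index;
        final Method method;
        final Object[] args;
        final Object result;
        final Throwable error;

        Record(int index, Method method, Object[] args, Object result, Throwable error) {
            this.index = index;
            this.method = method;
            this.args = args;
            this.result = result;
            this.error = error;
        }

        boolean isExp() {
            return error != null;
        }
    }

    private final Object target;
    private final List<Record> records = new ArrayList<>();
    private int nextIndex = 1000; // assumption: numbering starts at 1000

    TimeTunnelSketch(Object target) {
        this.target = target;
    }

    /** Invokes the method on the real target and captures the outcome instead of throwing. */
    private Record invoke(int index, Method method, Object[] args) {
        try {
            return new Record(index, method, args, method.invoke(target, args), null);
        } catch (InvocationTargetException e) {
            return new Record(index, method, args, null, e.getCause());
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns a proxy that records every call before handing the outcome back to the caller. */
    @SuppressWarnings("unchecked")
    <T> T recording(Class<T> iface) {
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] { iface },
                (proxy, method, args) -> {
                    Record record = invoke(nextIndex++, method, args);
                    records.add(record);
                    if (record.isExp()) {
                        throw record.error;
                    }
                    return record.result;
                });
    }

    List<Record> records() {
        return records;
    }

    /** Calls the recorded method again with the recorded arguments and returns the new outcome. */
    Record replay(int index) {
        for (Record record : records) {
            if (record.index == index) {
                return invoke(index, record.method, record.args);
            }
        }
        throw new IllegalArgumentException("no record with index " + index);
    }

    /** Records one successful and one failing call, like the two rows in the screenshots. */
    static TimeTunnelSketch demo() {
        UserService real = uid -> {
            if (uid == null || uid < 0) {
                throw new Exception("invalid uid");
            }
            return true;
        };
        TimeTunnelSketch tunnel = new TimeTunnelSketch(real);
        UserService service = tunnel.recording(UserService.class);
        for (Integer uid : new Integer[] { 1, -1 }) {
            try {
                service.check(uid);
            } catch (Exception expected) {
                // the failure is already recorded
            }
        }
        return tunnel;
    }

    public static void main(String[] args) {
        TimeTunnelSketch tunnel = demo();
        for (Record record : tunnel.records()) {
            System.out.println("INDEX=" + record.index + " IS-EXP=" + record.isExp());
        }
        System.out.println("replay 1001 -> " + tunnel.replay(1001).error);
    }
}
```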
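		<h2 id="45-火焰图分析">4.5. Flame graph analysis</h2>
		<p>Arthas recently added <strong>flame graph</strong> analysis. It uses <strong>async-profiler</strong> to generate CPU and memory flame graphs for performance analysis, which fills the earlier gap in memory analysis, and it is quite convenient to use from within Arthas.</p>
		<p>
			The <strong>
				<code>profiler</code>
			</strong> command generates flame graphs of the application's hot spots. In essence it samples continuously and then renders the collected samples as a flame graph.</p>
		<p>
			The basic form of the command is <strong>
				<code>profiler action [actionArg]</code>
			</strong>
		</p>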
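		<p>The sampling principle can be shown with a deliberately naive Java sketch: look at a thread's top stack frame at fixed intervals and count how often each method appears. The class and method names are invented for this illustration, and the approach (<code>Thread#getStackTrace()</code>) is much cruder than what async-profiler does; it only demonstrates why the hottest method collects the most samples.</p>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a naive sampling profiler for a single thread. */
class SamplingSketch {
    private static volatile boolean running;
    private static volatile long sink;

    /** Busy loop that should dominate the samples. */
    static void hotLoop() {
        long counter = 0;
        while (running) {
            counter++;
        }
        sink = counter; // publish the result so the loop has an observable effect
    }

    /** Starts a daemon thread that runs hotLoop() until stop() is called. */
    static Thread startHotThread() {
        running = true;
        Thread t = new Thread(SamplingSketch::hotLoop, "hot-thread");
        t.setDaemon(true);
        t.start();
        return t;
    }

    static void stop() {
        running = false;
    }

    /** Samples the thread's top stack frame and counts how often each method is seen. */
    static Map<String, Integer> sample(Thread thread, int samples, long intervalMillis) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = thread.getStackTrace();
            if (stack.length > 0) {
                String frame = stack[0].getClassName() + "." + stack[0].getMethodName();
                counts.merge(frame, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return counts;
    }

    /** Returns the most frequently sampled frame, or null if nothing was sampled. */
    static String hottest(Map<String, Integer> counts) {
        String best = null;
        int max = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > max) {
                max = entry.getValue();
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Thread hot = startHotThread();
        Map<String, Integer> counts = sample(hot, 20, 10);
        stop();
        System.out.println(counts + " hottest=" + hottest(counts));
    }
}
```

		<p>A flame graph goes one step further by aggregating whole stacks instead of only the top frame, so the width of each bar reflects how many samples passed through that method.</p>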
		<h3 id="451使用案例">
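			<strong>4.5.1. Usage example</strong>
		</h3>
		<p>
			<strong>Start the profiler</strong>
		</p>
		<p>By default a CPU flame graph is generated, that is, the event is <code>cpu</code>; use the <code>--event</code> option to choose a different one. Use the start action to begin collecting samples.</p>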
		<pre>
			<code class="language-shell">$ profiler start
Started [cpu] profiling
</code>
		</pre>
		<p>Get the number of samples collected so far:</p>
		<pre>
			<code class="language-shell">$ profiler getSamples
23
</code>
		</pre>
		<p>Check the profiler status, which shows the <code>event</code> currently being sampled and how long sampling has been running.</p>
		<pre>
			<code class="language-shell">$ profiler status

[cpu] profiling is running for 4 seconds
</code>
		</pre>
		<p>
			<strong>Stop the profiler</strong>
		</p>
		<p>Generate a flame graph in SVG format:</p>
		<pre>
			<code class="language-shell">$ profiler stop
profiler output file: /tmp/demo/arthas-output/20191125-135546.svg
OK
</code>
		</pre>
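		<p>By default the result is saved in the <code>arthas-output</code> directory under the application's <code>working directory</code>. Use the <code>--file</code> option to specify the output path.</p>
		<p>For example:</p>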
		<pre>
			<code class="language-shell">$ profiler stop --file /tmp/output.svg
</code>
		</pre>
		<p>
			<strong>HTML output</strong>
		</p>
		<p>By default the result file is in <code>svg</code> format. To generate <code>html</code> instead, use the <code>--format</code> option: $ profiler stop --format html</p>
		<p>
			<strong>View the profiler results</strong>
		</p>
		<p>By default Arthas uses port 3658, so you can open http://localhost:3658/arthas-output/ to browse the profiler results in the <code>arthas-output</code> directory:</p>
		<p>
			<img src="https://cdn.debug.group/img/23/10/151226645.webp" alt=""/>
		</p>
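		<p>Click an entry to view the result: <strong>in a flame graph, the longer a bar is, the more of the sampled resource it used; the call stack reads from bottom to top</strong>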
		</p>
		<p>
			<img src="https://cdn.debug.group/img/23/10/151303153.webp" alt=""/>
		</p>
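		<p><strong>The profiler supports several kinds of analysis.</strong> Common values for event are cpu, alloc, lock, cache-misses and so on. For example, to analyze memory allocation:</p>
		<p>$ profiler start --event alloc</p>
		<h3 id="452-复杂命令">4.5.2. Advanced commands</h3>
		<p>For example, to start sampling:</p>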
		<pre>
			<code class="language-shell">profiler execute 'start'
</code>
		</pre>
		<p>Stop sampling and save the result to the specified file:</p>
		<pre>
			<code class="language-shell">profiler execute 'stop,file=/tmp/result.svg'
</code>
		</pre>
		<p>The code in this article has been uploaded to <a href="https://github.com/niumoo/lab-notes/" target="_blank">Github</a>.</p>
	</div>
</div>
